Identification and characterization of plant genes

ABSTRACT

The invention discloses a set of genes the expression products of which are up-regulated during the grain filling process in rice and active in different metabolic pathways involved in nutrient partitioning. The invention also discloses the use of said genes to modify the compositional and nutritional characteristics of the plant grain.

The present invention is in the area of plant biotechnology. In particular, the invention relates to a set of genes the expression products of which are up-regulated during the grain filling process in rice and active in different metabolic pathways involved in nutrient partitioning. The invention also relates to the use of said genes to modify the compositional and nutritional characteristics of the plant grain.

It has long been recognized that the value of agricultural products such as cereal grains and the like is affected by the quality of their inherent constituent components. In particular, cereal grains with improved protein, oil, starch, fiber, and moisture content and desirable levels of carbohydrates and other constituents are of economic interest.

In rice, for example, yield, nutritional characteristics and eating quality are the most important economic traits. The first two traits are mostly determined by the composition and accumulation of carbohydrates, proteins, and minerals during grain filling, and the latter by the interaction of various enzymes to produce the final structure of the starch at the molecular and granule levels. Manipulation of these pathways can result in significant improvement in the nutritional value. For example, reduction of the amount of even one enzyme, granule-bound starch synthase, in the starch biosynthetic pathway can dramatically affect the eating quality, resulting in softer, less sticky cooked rice. Some genes participating in nutrient partitioning during rice grain filling and affecting starch quality have been previously identified. However, the genes participating in these processes and their transcriptional controls are poorly understood.

Within the scope of the present invention a set of genes is now provided which were shown to be involved in the grain filling process based on their mRNA expression characteristics. The genes within this subset are preferentially up-regulated and share a similar expression pattern during the process of grain filling. The expression levels of those genes increase synchronously during grain development while the encoded gene products are active in different pathways. The genes within this subgroup, representative examples of which are provided in the Sequence Listing, are thus useful tools for generating plants which produce grain with modified compositional characteristics leading to improved nutritional properties.

One of the main objectives of the present invention is thus to provide a polynucleotide comprising a nucleotide sequence encoding a polypeptide the expression of which is up-regulated during grain filling and the use of said molecule for modifying the nutritional composition and quality of plant grain.

The majority of the genes within this group encode protein products that are directly involved in or associated with three major pathways of nutrition partitioning: the synthesis and transport of (1) carbohydrates, (2) proteins, and (3) fatty acids.

The most dramatic increase in relative mRNA expression levels is shown by those genes whose products control the synthesis of carbohydrates and proteins and can be found in the endosperm of the developing seed, which is the main sink for plant nutrients.

The other group of genes which shows a significant increase in relative mRNA expression levels comprises genes that are involved in and in control of fatty acid biosynthesis. These genes have a more balanced expression between the embryo and endosperm.

In one embodiment the invention thus relates to a subset of isolated nucleic acid molecules comprising a nucleotide sequence encoding a polypeptide that is involved in at least one of the major pathways of nutrition partitioning selected from the group consisting of synthesis, transport, metabolism or degradation of carbohydrates, proteins, and fatty acids.

Another subset of nucleic acid molecules provided herein comprises a number of nucleic acids that encode different transporters, such as sugar transporters, ABC transporters, amino acid/peptide transporters, phosphate transporters, and nitrate transporters.

Still another subset of nucleic acid molecules that is provided as part of the invention comprises nucleic acid molecules that are involved in the transcriptional control of the highly coordinated grain filling process.

Further subsets of nucleic acid molecules provided herein comprise nucleic acid molecules the expression products of which are associated with amino acid metabolism, signal transduction, and stress regulation, respectively.

In a collective embodiment applicable to all of the nucleic acid molecules disclosed herein, the invention relates to the use of the nucleic acid molecules according to the invention as hybridization probes, for chromosome and gene mapping, in PCR technologies, in the production of sense or antisense nucleic acids, in screening for new therapeutic molecules, in production of plants and seeds having desirable, inheritable, commercially useful phenotypes, or in discovery of inhibitory compounds.

The invention further relates to any polypeptides encoded by the nucleic acid molecules according to the invention, or any antigene sequences thereof, which have numerous applications using techniques that are known to those skilled in the art of molecular biology, biotechnology, biochemistry, genetics, physiology or pathology.

In a further collective embodiment, the present invention provides the ability to modulate the grain filling process, by over-expressing, under-expressing or knocking out one or more of the genes disclosed herein or their gene products, in a plant cell, in vitro or in planta. Expression vectors comprising at least one nucleic acid molecule according to the invention, or any antigenes thereof, operably linked to at least one suitable promoter and/or regulatory sequence can be used to study the role of polypeptides encoded by said sequences, for example by transforming a host cell with said expression vector and measuring the effects of overexpression and underexpression of said nucleic acid molecules. Suitable promoter and/or regulatory sequences include especially those that are preferentially or specifically active in plant grain tissue such as, for example, the grain endosperm or the grain embryo. A host cell transformed with at least one expression vector comprising at least one nucleic acid molecule of the invention, operably linked to suitable promoters and/or regulatory sequences, can be useful to produce a plant grain with improved nutritional or dietary properties.

In a further collective embodiment, the present invention provides a transformed plant host cell, or one obtained through breeding, capable of over-expressing, under-expressing, or having a knock out of at least one of the genes according to the invention and/or their gene products.

Such a plant cell, transformed with at least one expression vector comprising a nucleic acid molecule of the invention, operably linked to suitable promoters and/or regulatory sequences, can be used to regenerate plant tissue or an entire plant, or seed therefrom, in which the effects of expression, including overexpression or underexpression, of the introduced sequence or sequences can be measured in vitro or in planta.

In a further embodiment the present invention provides nucleotide sequences including regions of nucleotide sequence encoding polypeptides having homology to at least one functional protein domain (FPD). Embodiments of the invention further provide polypeptides including regions of amino acid sequence having homology to an FPD. In cases where the polypeptide has homology to an FPD in the same or closely related species, the polypeptide may represent a paralogous sequence or paralog, or may represent a variant allele of a gene encoding the FPD. In cases where the polypeptide has homology to an FPD in another species, including other plant species and especially non-plant species, polypeptides may represent orthologous sequences, or orthologs, of the FPD.

In a further collective embodiment of the invention the nucleic acid molecules disclosed herein or representative parts thereof can be used in hybridization-based assays for detecting and identifying nucleic acid molecules that encode protein products that are involved in the grain filling process, more particularly in at least one of the major pathways of nutrition partitioning selected from the group consisting of synthesis, transport, metabolism or degradation of carbohydrates, proteins, and fatty acids, in plants other than rice, but especially in plants belonging to the cereal group.

Embodiments of the present invention provide a unique oligonucleotide having a sequence identical to or complementary to a region of a polynucleotide sequence encoding at least a portion of a homologue of a protein according to the invention representatives of which are identified by SEQ ID NOs: 2-462, 502-512, and 514-642 provided in the Sequence Listing and/or an FPD thereof, the oligonucleotide being identified by the methods disclosed herein. In one embodiment, the unique oligonucleotide has a length of between 12 and 250 nucleotide bases.
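The two criteria named above (identity or complementarity to a region of a target polynucleotide, and a length of between 12 and 250 bases) can be illustrated with a minimal sketch. The function names are hypothetical, the length bounds are assumed to be inclusive, and the sketch does not attempt to test uniqueness, which would require comparison against a full sequence collection.

```python
# Illustrative sketch only; helper names are hypothetical and not part of the specification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence containing only A, C, G, T."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return seq.translate(COMPLEMENT)[::-1]


def is_candidate_oligo(oligo: str, target: str,
                       min_len: int = 12, max_len: int = 250) -> bool:
    """Check length (assumed inclusive bounds) and identity or complementarity to a target region."""
    oligo = oligo.upper()
    target = target.upper()
    if not (min_len <= len(oligo) <= max_len):
        return False
    # Identical to a region of the target, or complementary to one.
    return oligo in target or reverse_complement(oligo) in target
```

For example, a 15-base substring of a target sequence and the reverse complement of that substring would both pass, whereas a 4-base fragment would be rejected on length alone.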

Embodiments of the present invention also provide a nucleotide microarray comprising the unique oligonucleotide having a sequence identical to or complementary to a region of polynucleotide sequence encoding at least a portion of a homologue of a protein according to the invention representatives of which are identified by SEQ ID NOs: 2-462, 502-512, and 514-642 provided in the Sequence Listing and/or an FPD thereof. Preferably, the microarray includes a plurality of different, unique oligonucleotides, the sequences corresponding to a plurality of homologues of a protein according to the invention representatives of which are identified by the SEQ ID NOs provided in the Sequence Listing and/or an FPD thereof. Equally preferably, the microarray contains at least about 96 different unique oligonucleotides, wherein each of the 96 different unique oligonucleotides has a sequence that is identical, complementary, or substantially similar to a segment of a nucleotide sequence as given in SEQ ID NOs: 1-461, 501-511, and 513-641 provided in the Sequence Listing.

Embodiments of the present invention also provide a kit for detecting the presence of a polynucleotide, the kit containing a first nucleotide probe which can hybridize with a region of a nucleotide sequence including the nucleotide sequences of SEQ ID NOs: 1-461 provided in the Sequence Listing, a fragment or a variant thereof, and a complementary sequence thereto, the kit further containing at least one additional component such as, for example: a second nucleotide probe, a buffer, an enzyme, a label, a molecular weight standard, a reaction chamber, and a micropipette tip.

Embodiments of the present invention further provide a kit for detecting the presence of a polypeptide, the kit containing a first probe which can hybridize with a region of a polypeptide including the amino acid sequences of SEQ ID NOs: 2-462, 502-512, and 514-642 provided in the Sequence Listing, a fragment or a variant thereof, and optionally, the kit further containing at least one additional component such as, for example: a probe, a buffer, an enzyme, a label, a molecular weight standard, a reaction chamber, and a micropipette tip. Probes useful in kit embodiments include antibodies, affinity tags, protein A, protein G, or protein-binding substances including chromatographic media.

An additional aspect provides a method for selecting plants, for example cereals, having an altered carbohydrate, protein or fatty acid content and/or composition of the grain comprising obtaining nucleic acid molecules from the plants to be selected, contacting the nucleic acid molecules with one or more probes that selectively hybridize under stringent or highly stringent conditions to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-461, 501-511, and 513-641; detecting the hybridization of the one or more probes to the nucleic acid sequences wherein the presence of the hybridization indicates the presence of a gene associated with altered carbohydrate, protein or fatty acid content and/or composition of the grain; and selecting plants on the basis of the presence or absence of such hybridization. In one embodiment, marker-assisted selection is accomplished in rice. In another embodiment, marker-assisted selection is accomplished in wheat using one or more probes which selectively hybridize under stringent or highly stringent conditions to sequences selected from the group consisting of SEQ ID NOs: 951-1105. In yet another embodiment, marker-assisted selection is accomplished in maize or corn using one or more probes which selectively hybridize under stringent or highly stringent conditions to sequences selected from the group consisting of SEQ ID NOs: 1106-1201. In still another embodiment, marker-assisted selection is accomplished in banana using one or more probes which selectively hybridize under stringent or highly stringent conditions to sequences selected from the group consisting of SEQ ID NOs: 884-950. In each case marker-assisted selection can be accomplished using a probe or probes to a single sequence or multiple sequences. If multiple sequences are used they can be used simultaneously or sequentially.

In a further embodiment of the invention a computer readable medium containing one or more of the nucleotide sequences of the invention is provided as well as methods of use for the computer readable medium. This medium allows a nucleotide sequence corresponding to at least one of the sequences selected from the group consisting of SEQ ID NOs: 1-461, 501-511, 513-641, and 884-1201 provided in the Sequence Listing (open reading frames or fragments thereof), to be used as a reference sequence to search against a database. This medium also allows for computer-based manipulation of a nucleotide sequence corresponding to at least one of the sequences selected from the group consisting of SEQ ID NOs: 1-461, 501-511, 513-641, and 884-1201 provided in the Sequence Listing.

Further aspects, features and advantages of this invention will become apparent from the detailed description of the preferred embodiments that follow.

A further aspect provides a computer readable medium having stored thereon computer executable instructions for performing a method comprising receiving data on nucleotide sequence expression in a test plant of at least one nucleic acid molecule having at least 70%, at least 80%, at least 90% or at least 95% sequence identity to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 1-461, 501-511, 513-641, and 884-1201; and comparing expression data from said test plant to expression data for the same nucleotide sequence or sequences in a plant during grain filling.
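The identity thresholds recited above (70%, 80%, 90%, 95%) can be illustrated with a minimal sketch. The passage does not state how sequence identity is to be computed, so the sketch assumes two sequences that have already been aligned to equal length, counts matching non-gap columns, and uses the alignment length as the denominator. The function names are hypothetical.

```python
# Illustrative sketch only; the identity definition used here is an assumption.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity: float,
                   thresholds=(70.0, 80.0, 90.0, 95.0)) -> list:
    """Return the recited thresholds that a given percent identity satisfies."""
    return [t for t in thresholds if identity >= t]
```

Under these assumptions, two ten-base sequences differing at one position score 90% and satisfy the 70%, 80% and 90% thresholds but not the 95% threshold.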

BRIEF DESCRIPTION OF THE SEQUENCE LISTING

In the following, a brief description of the sequences in the Sequence Listing is provided:

Odd numbered SEQ ID NOs: 1-461 represent a first sub-group (sub-group I) of polynucleotides comprising nucleotide sequences which encode polypeptides that are up-regulated during grain filling and are described in Tables 1-11 below.

Even numbered SEQ ID NOs: 2-462 are protein sequences encoded by the immediately preceding nucleotide sequence, e.g., SEQ ID NO: 2 is the protein encoded by the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 4 is the protein encoded by the nucleotide sequence of SEQ ID NO: 3, etc.

Odd numbered SEQ ID NOs: 501-511 represent a second sub-group (sub-group II) of polynucleotides comprising rice cDNA sequences. The correlation between the sequences in sub-groups I and II is illustrated in Table 13.

Even numbered SEQ ID NOs: 502-512 are protein sequences encoded by the immediately preceding nucleotide sequence.

Odd numbered SEQ ID NOs: 513-641 represent a third sub-group (sub-group III) of polynucleotides comprising nucleotide sequences that have homologies between 80% and 99.90% to the nucleotide sequences of sub-group I and are possible variants or family members of the rice sequences provided in SEQ ID NOs: 1-461. The correlation between the sequences in sub-groups I and III is illustrated in Table 12.

Even numbered SEQ ID NOs: 514-642 are protein sequences encoded by the immediately preceding nucleotide sequence.

SEQ ID NOs: 643-883 are promoter sequences.

SEQ ID NOs: 884-950 are banana sequences which show homology to rice “grain filling” genes.

SEQ ID NOs: 951-1105 are wheat sequences which show homology to rice “grain filling” genes.

SEQ ID NOs: 1106-1201 are maize sequences which show homology to rice “grain filling” genes.
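The numbering scheme described above can be summarized as a small lookup, shown here as a minimal sketch. The function and table names are hypothetical; the ranges are those given in this brief description, and identifiers that fall outside them are simply reported as not described.

```python
# Illustrative sketch only; names are hypothetical. Ranges follow the brief description above.
SEQ_ID_GROUPS = (
    (1, 462, "sub-group I", True),      # odd = nucleotide, even = encoded protein
    (501, 512, "sub-group II", True),
    (513, 642, "sub-group III", True),
    (643, 883, "promoter", False),
    (884, 950, "banana homolog", False),
    (951, 1105, "wheat homolog", False),
    (1106, 1201, "maize homolog", False),
)


def describe_seq_id(seq_id: int) -> str:
    """Return a short label for a SEQ ID NO according to the ranges listed above."""
    for low, high, label, paired in SEQ_ID_GROUPS:
        if low <= seq_id <= high:
            if paired:
                kind = "nucleotide" if seq_id % 2 == 1 else "protein"
                return f"{label} {kind}"
            return label
    return "not described"


def encoded_protein_id(nucleotide_id: int) -> int:
    """Return the SEQ ID NO of the protein encoded by an odd-numbered nucleotide sequence."""
    if not describe_seq_id(nucleotide_id).endswith("nucleotide"):
        raise ValueError(f"SEQ ID NO {nucleotide_id} is not a paired nucleotide sequence")
    return nucleotide_id + 1
```

With this lookup, SEQ ID NO: 3 is labeled as a sub-group I nucleotide sequence whose encoded protein is SEQ ID NO: 4, and SEQ ID NO: 1000 is labeled as a wheat homolog.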

Definitions

For clarity, certain terms used in the specification are defined and presented as follows:

The term “gene” is used broadly to refer to any segment of nucleic acid associated with a biological function. Thus, genes include coding sequences and/or the regulatory sequences required for their expression. For example, gene refers to a nucleic acid fragment that expresses mRNA or functional RNA, or encodes a specific protein, and which includes regulatory sequences. Genes also include nonexpressed DNA segments that, for example, form recognition sequences for other proteins. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.

The term “native” or “wild type” gene refers to a gene that is present in the genome of an untransformed cell, i.e., a cell not having a known mutation.

A “marker gene” encodes a selectable or screenable trait.

The term “chimeric gene” refers to any gene that contains 1) DNA sequences, including regulatory and coding sequences, that are not found together in nature, or 2) sequences encoding parts of proteins not naturally adjoined, or 3) parts of promoters that are not naturally adjoined. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or comprise regulatory sequences and coding sequences derived from the same source, but arranged in a manner different from that found in nature.

A “transgene” refers to a gene that has been introduced into the genome by transformation and is stably maintained. Transgenes may include, for example, genes that are either heterologous or homologous to the genes of a particular plant to be transformed. Additionally, transgenes may comprise native genes inserted into a non-native organism, or chimeric genes. The term “endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” gene refers to a gene not normally found in the host organism but that is introduced by gene transfer.

An “oligonucleotide” corresponding to a nucleotide sequence of the invention, e.g., for use in probing or amplification reactions, may be about 30 or fewer nucleotides in length (e.g., 9, 12, 15, 18, 20, 21 or 24, or any number between 9 and 30). Generally, specific primers are upwards of 14 nucleotides in length. For optimum specificity and cost effectiveness, primers of 16 to 24 nucleotides in length may be preferred. Those skilled in the art are well versed in the design of primers for use in processes such as PCR. If required, probing can be done with entire restriction fragments of the gene disclosed herein, which may be hundreds or even thousands of nucleotides in length.

The terms “protein,” “peptide” and “polypeptide” are used interchangeably herein.

The nucleotide sequences of the invention can be introduced into any plant. The genes to be introduced can be conveniently used in expression cassettes for introduction and expression in any plant of interest. Such expression cassettes will comprise the transcriptional initiation region of the invention linked to a nucleotide sequence of interest. Preferred promoters include constitutive, tissue-specific, development-specific, inducible and/or viral promoters. Such an expression cassette is provided with a plurality of restriction sites for insertion of the gene of interest to be under the transcriptional regulation of the regulatory regions. The expression cassette may additionally contain selectable marker genes. The cassette will include in the 5′-3′ direction of transcription, a transcriptional and translational initiation region, a DNA sequence of interest, and a transcriptional and translational termination region functional in plants. The termination region may be native with the transcriptional initiation region, may be native with the DNA sequence of interest, or may be derived from another source. Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also, Guerineau et al., 1991; Proudfoot, 1991; Sanfacon et al., 1991; Mogen et al., 1990; Munroe et al., 1990; Ballas et al., 1989; Joshi et al., 1987.

“Coding sequence” refers to a DNA or RNA sequence that codes for a specific amino acid sequence and excludes the non-coding sequences. It may constitute an “uninterrupted coding sequence”, i.e., lacking an intron, such as in a cDNA or it may include one or more introns bounded by appropriate splice junctions. An “intron” is a sequence of RNA which is contained in the primary transcript but which is removed through cleavage and re-ligation of the RNA within the cell to create the mature mRNA that can be translated into a protein.

The terms “open reading frame” and “ORF” refer to the amino acid sequence encoded between translation initiation and termination codons of a coding sequence. The terms “initiation codon” and “termination codon” refer to a unit of three adjacent nucleotides (‘codon’) in a coding sequence that specifies initiation and chain termination, respectively, of protein synthesis (mRNA translation).
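The relationship between initiation codon, termination codon and open reading frame can be illustrated with a minimal sketch. It assumes the standard genetic code (ATG as the initiation codon; TAA, TAG and TGA as termination codons), scans only the given strand, and returns the nucleotide span from initiation codon through termination codon rather than the encoded amino acid sequence. The function name is hypothetical.

```python
# Illustrative sketch only; assumes the standard genetic code and a single strand.
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(seq: str):
    """Return the first ATG...stop span read in frame, or None if no complete ORF exists."""
    seq = seq.upper()
    start = seq.find(START_CODON)
    while start != -1:
        # Step through codons in the frame set by this initiation codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        # No in-frame termination codon; try the next initiation codon.
        start = seq.find(START_CODON, start + 1)
    return None
```

For the input `CCATGGCCTGAAA`, the sketch returns `ATGGCCTGA`: the initiation codon ATG, one codon GCC, and the termination codon TGA.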

A “functional RNA” refers to an antisense RNA, ribozyme, or other RNA that is not translated.

The term “RNA transcript” refers to the product resulting from RNA polymerase catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript; when it is an RNA sequence derived from posttranscriptional processing of the primary transcript, it is referred to as the mature RNA.

“Messenger RNA” (mRNA) refers to the RNA that is without introns and that can be translated into protein by the cell. “cDNA” refers to a single- or a double-stranded DNA that is complementary to and derived from mRNA.

“Regulatory sequences” and “suitable regulatory sequences” each refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences include enhancers, promoters, translation leader sequences, introns, and polyadenylation signal sequences. They include natural and synthetic sequences as well as sequences which may be a combination of synthetic and natural sequences. As is noted above, the term “suitable regulatory sequences” is not limited to promoters.

“5′ non-coding sequence” refers to a nucleotide sequence located 5′ (upstream) to the coding sequence. It is present in the fully processed mRNA upstream of the initiation codon and may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency (Turner et al., 1995).

“3′ non-coding sequence” refers to nucleotide sequences located 3′ (downstream) to a coding sequence and includes polyadenylation signal sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor. The use of different 3′ non-coding sequences is exemplified by Ingelbrecht et al., 1989.

The term “translation leader sequence” refers to that DNA sequence portion of a gene between the promoter and coding sequence that is transcribed into RNA and is present in the fully processed mRNA upstream (5′) of the translation start codon. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency.

“Signal peptide” refers to the amino terminal extension of a polypeptide, which is translated in conjunction with the polypeptide forming a precursor peptide and which is required for its entrance into the secretory pathway. The term “signal sequence” refers to a nucleotide sequence that encodes the signal peptide.

“Promoter” refers to a nucleotide sequence, usually upstream (5′) to its coding sequence, which controls the expression of the coding sequence by providing the recognition for RNA polymerase and other factors required for proper transcription. “Promoter” includes a minimal promoter that is a short DNA sequence comprised of a TATA box and other sequences that serve to specify the site of transcription initiation, to which regulatory elements are added for control of expression. “Promoter” also refers to a nucleotide sequence that includes a minimal promoter plus regulatory elements that is capable of controlling the expression of a coding sequence or functional RNA. This type of promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an “enhancer” is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue specificity of a promoter. It is capable of operating in both orientations (normal or flipped), and is capable of functioning even when moved either upstream or downstream from the promoter. Both enhancers and other upstream promoter elements bind sequence-specific DNA-binding proteins that mediate their effects. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even be comprised of synthetic DNA segments. A promoter may also contain DNA sequences that are involved in the binding of protein factors which control the effectiveness of transcription initiation in response to physiological or developmental conditions.

The “initiation site” is the position surrounding the first nucleotide that is part of the transcribed sequence, which is also defined as position +1. With respect to this site all other sequences of the gene and its controlling regions are numbered. Downstream sequences (i.e., further protein encoding sequences in the 3′ direction) are denominated positive, while upstream sequences (mostly of the controlling regions in the 5′ direction) are denominated negative.
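The numbering convention can be illustrated with a minimal sketch that converts a zero-based index in a sequence into a position relative to the initiation site. It assumes the common convention that there is no position 0, so the nucleotide immediately upstream of +1 is numbered -1; the passage above does not state this explicitly. The function name is hypothetical.

```python
# Illustrative sketch only; assumes there is no position 0 in the numbering.
def relative_position(index: int, initiation_index: int) -> int:
    """Map a zero-based sequence index to a position relative to the initiation site (+1)."""
    offset = index - initiation_index
    # The initiation site and everything downstream are positive, starting at +1.
    if offset >= 0:
        return offset + 1
    # Upstream positions are negative, starting at -1.
    return offset
```

With the initiation site at index 100, index 100 maps to +1, index 101 to +2, index 99 to -1, and index 70 to -30.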

Promoter elements, particularly a TATA element, that are inactive or that have greatly reduced promoter activity in the absence of upstream activation are referred to as “minimal or core promoters.” In the presence of a suitable transcription factor, the minimal promoter functions to permit transcription. A “minimal or core promoter” thus consists only of all basal elements needed for transcription initiation, e.g., a TATA box and/or an initiator.

“Constitutive expression” refers to expression using a constitutive or regulated promoter. “Conditional” and “regulated expression” refer to expression controlled by a regulated promoter.

“Constitutive promoter” refers to a promoter that is able to express the open reading frame (ORF) that it controls in all or nearly all of the plant tissues during all or nearly all developmental stages of the plant. Each of the transcription-activating elements does not exhibit an absolute tissue-specificity, but mediates transcriptional activation in most plant parts at a level of ≥1% of the level reached in the part of the plant in which transcription is most active.
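The 1% criterion can be illustrated with a minimal sketch that reports the fraction of plant parts in which expression reaches at least 1% of the level in the most active part. The function name and the example tissue names are hypothetical, and the definition above does not quantify “most plant parts”, so the sketch returns the fraction and leaves that judgment to the reader.

```python
# Illustrative sketch only; names and example values are hypothetical.
def fraction_at_or_above(levels: dict, threshold: float = 0.01) -> float:
    """Fraction of plant parts whose level is >= threshold * the maximum level."""
    peak = max(levels.values())  # raises ValueError if levels is empty
    if peak <= 0:
        raise ValueError("at least one plant part must show positive expression")
    passing = [part for part, level in levels.items() if level >= threshold * peak]
    return len(passing) / len(levels)
```

For hypothetical relative levels of 100 in leaf, 5 in root, 2 in stem and 0.5 in seed, three of the four parts reach 1% of the leaf level, giving a fraction of 0.75.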

“Regulated promoter” refers to promoters that direct gene expression not constitutively, but in a temporally- and/or spatially-regulated manner, and includes both tissue-specific and inducible promoters. It includes natural and synthetic sequences as well as sequences which may be a combination of synthetic and natural sequences. Different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro et al. (1989). Typical regulated promoters useful in plants include but are not limited to safener-inducible promoters, promoters derived from the tetracycline-inducible system, promoters derived from salicylate-inducible systems, promoters derived from alcohol-inducible systems, promoters derived from glucocorticoid-inducible systems, promoters derived from pathogen-inducible systems, and promoters derived from ecdysone-inducible systems.

“Tissue-specific promoter” refers to regulated promoters that are not expressed in all plant cells but only in one or more cell types in specific organs (such as leaves or seeds), specific tissues (such as embryo or cotyledon), or specific cell types (such as leaf parenchyma or seed storage cells). These also include promoters that are temporally regulated, such as in early or late embryogenesis, during fruit ripening in developing seeds or fruit, in fully differentiated leaf, or at the onset of senescence.

“Inducible promoter” refers to those regulated promoters that can be turned on in one or more cell types by an external stimulus, such as a chemical, light, hormone, stress, or a pathogen.

“Operably-linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a regulatory DNA sequence is said to be “operably linked to” or “associated with” a DNA sequence that codes for an RNA or a polypeptide if the two sequences are situated such that the regulatory DNA sequence affects expression of the coding DNA sequence (i.e., that the coding sequence or functional RNA is under the transcriptional control of the promoter). Coding sequences can be operably-linked to regulatory sequences in sense or antisense orientation.

“Expression” refers to the transcription and/or translation of an endogenous gene, ORF or portion thereof, or a transgene in plants. For example, in the case of antisense constructs, expression may refer to the transcription of the antisense DNA only. In addition, expression refers to the transcription and stable accumulation of sense (mRNA) or functional RNA. Expression may also refer to the production of protein.

“Specific expression” is the expression of gene products which is limited to one or a few plant tissues (spatial limitation) and/or to one or a few plant developmental stages (temporal limitation). It is acknowledged that true specificity hardly exists: promoters seem to be preferentially switched on in some tissues, while in other tissues there can be no or only little activity. This phenomenon is known as leaky expression. However, specific expression in this invention means preferential expression in one or a few plant tissues.

The “expression pattern” of a promoter (with or without enhancer) is the pattern of expression levels which shows where in the plant and in what developmental stage transcription is initiated by said promoter. Expression patterns of a set of promoters are said to be complementary when the expression pattern of one promoter shows little overlap with the expression pattern of the other promoter. The level of expression of a promoter can be determined by measuring the ‘steady state’ concentration of a standard transcribed reporter mRNA. This measurement is indirect since the concentration of the reporter mRNA is dependent not only on its synthesis rate, but also on the rate with which the mRNA is degraded. Therefore, the steady state level is the combined result of synthesis rates and degradation rates.

The rate of degradation can, however, be considered to proceed at a fixed rate when the transcribed sequences are identical, and thus the steady state level can serve as a measure of synthesis rates. When promoters are compared in this way, techniques available to those skilled in the art include hybridization S1-RNAse analysis, northern blots and competitive RT-PCR. This list of techniques in no way represents all available techniques, but rather describes commonly used procedures to analyze transcription activity and expression levels of mRNA.
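The reasoning above can be made concrete with a simple first-order kinetic model. The model itself and the symbols M (reporter mRNA concentration), k_s (synthesis rate) and k_d (degradation rate constant) are assumptions introduced here for illustration only; they are not taken from the original text.

```latex
\frac{dM}{dt} = k_s - k_d\,M
\quad\Longrightarrow\quad
M_{ss} = \frac{k_s}{k_d},
\qquad
\frac{M_{ss}^{(1)}}{M_{ss}^{(2)}} = \frac{k_s^{(1)}}{k_s^{(2)}}
\ \text{ when } k_d^{(1)} = k_d^{(2)}
```

Under this sketch, two promoters driving an identical reporter transcript share the same k_d, so the ratio of their steady state levels equals the ratio of their synthesis rates.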

The analysis of transcription start points in practically all promoters has revealed that there is usually no single base at which transcription starts, but rather a more or less clustered set of initiation sites, each of which accounts for some start points of the mRNA. Since this distribution varies from promoter to promoter, the sequences of the reporter mRNA in each of the populations would differ from each other. Since each mRNA species is more or less prone to degradation, no single degradation rate can be expected for different reporter mRNAs. It has been shown for various eukaryotic promoter sequences that the sequence surrounding the initiation site (‘initiator’) plays an important role in determining the level of RNA expression directed by that specific promoter. This includes also part of the transcribed sequences. The direct fusion of promoter to reporter sequences would therefore lead to suboptimal levels of transcription.

A commonly used procedure to analyze expression patterns and levels is through determination of the ‘steady state’ level of protein accumulation in a cell. Commonly used candidates for the reporter gene, known to those skilled in the art, are β-glucuronidase (GUS), chloramphenicol acetyl transferase (CAT) and proteins with fluorescent properties, such as green fluorescent protein (GFP) from Aequorea victoria. In principle, however, many more proteins are suitable for this purpose, provided the protein does not interfere with essential plant functions. For quantification and determination of localization a number of tools are suitable. Detection systems can readily be created or are available which are based on, e.g., immunochemical, enzymatic, or fluorescent detection and quantification. Protein levels can be determined in plant tissue extracts or in intact tissue using in situ analysis of protein expression.

Generally, individual transformed lines with one chimeric promoter-reporter construct will vary in their levels of expression of the reporter gene. Also frequently observed is the phenomenon that such transformants do not express any detectable product (RNA or protein). The variability in expression is commonly ascribed to ‘position effects’, although the molecular mechanisms underlying this inactivity are usually not clear.

“Overexpression” refers to the level of expression in transgenic cells or organisms that exceeds levels of expression in normal or untransformed (nontransgenic) cells or organisms.

“Antisense inhibition” refers to the production of antisense RNA transcripts capable of suppressing the expression of protein from an endogenous gene or a transgene.

“Gene silencing” refers to homology-dependent suppression of viral genes, transgenes, or endogenous nuclear genes. Gene silencing may be transcriptional, when the suppression is due to decreased transcription of the affected genes, or post-transcriptional, when the suppression is due to increased turnover (degradation) of RNA species homologous to the affected genes (English et al., 1996). Gene silencing includes virus-induced gene silencing (Ruiz et al. 1998).

The terms “heterologous DNA sequence,” “exogenous DNA segment” or “heterologous nucleic acid,” as used herein, each refer to a sequence that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified through, for example, the use of DNA shuffling. The terms also include non-naturally occurring multiple copies of a naturally occurring DNA sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position within the host cell nucleic acid in which the element is not ordinarily found. Exogenous DNA segments are expressed to yield exogenous polypeptides. A “homologous” DNA sequence is a DNA sequence that is naturally associated with a host cell into which it is introduced.

“Homologous to” in the context of nucleotide sequence identity refers to the similarity between the nucleotide sequences of two nucleic acid molecules or between the amino acid sequences of two protein molecules. As used herein, “homology” and “homologous” refer to an evaluation of the similarity between two sequences based on measurements of sequence identity adjusted for variables including gaps, insertions, frame shifts, conservative substitutions, and sequencing errors, as described below. Two nucleotide sequences or polypeptides are said to be “identical” if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned for maximum correspondence as described below. The term “complementary to” is used herein to mean that the sequence can form Watson-Crick base pairs with a reference polynucleotide sequence. Complementary sequences can include nucleotides, such as inosine, that neither disrupt Watson-Crick base pairing nor contribute to the pairing. A “reverse complement” of a sequence corresponds to the complementary sequence, but in the opposite orientation of bases from 5′ to 3′, or to the complement of the primary sequence, if the primary sequence is in a reverse orientation of bases from 5′ to 3′.
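The distinction between a complement and a reverse complement can be illustrated with a minimal Python sketch. The function names and the restriction to the four unmodified DNA bases are assumptions made for this example; analogs such as inosine are not handled.

```python
# Watson-Crick pairing for the four standard DNA bases (illustrative
# assumption: uppercase A, C, G, T only; no analogs such as inosine).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, read in the same direction."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complement written in the opposite 5' to 3' orientation."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    primary = "ATGC"
    print(complement(primary))          # complement, same orientation
    print(reverse_complement(primary))  # complement, opposite orientation
```

Reversing the string after complementing it is what lets both strands be written 5′ to 3′, which is the convention the definition above relies on.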

Homology is evaluated using any of the variety of sequence comparison algorithms and programs known in the art. Such algorithms and programs include, but are by no means limited to, TBLASTN, BLASTP, FASTA, TFASTA, and CLUSTALW (Pearson and Lipman, Proc Natl Acad Sci (USA) 85:2444 (1988); Altschul et al., J. Mol Biol 215:403 (1990)). In a particularly preferred embodiment, protein and nucleic acid sequence homologies are evaluated using the Basic Local Alignment Search Tool (“BLAST”) which is well known in the art (Karlin and Altschul, Proc Natl Acad Sci USA 87:2264 (1990); Altschul et al. (1990) supra, Altschul et al., Nucleic Acids Res 25:3389 (1997)). In particular, five specific BLAST programs are used to perform the following tasks:

(1) BLASTP and BLAST3 compare an amino acid query sequence against a protein sequence database;

(2) BLASTN compares a nucleotide query sequence against a nucleotide sequence database;

(3) BLASTX compares the six-frame conceptual translation products of a query nucleotide sequence (both strands) against a protein sequence database;

(4) TBLASTN compares a query protein sequence against a nucleotide sequence database translated in all six reading frames (both strands); and

(5) TBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.

The BLAST programs identify homologous sequences by identifying similar segments, which are referred to herein as “high-scoring segment pairs,” between a query amino or nucleic acid sequence and a test sequence which is preferably obtained from a protein or nucleic acid sequence database. High-scoring segment pairs are preferably identified (aligned) by means of a scoring matrix selected from the many scoring matrices known in the art. Preferably, the scoring matrix used is the BLOSUM62 matrix (Gonnet et al., Science 256:1443 (1992); Henikoff and Henikoff, Proteins 17:49 (1993)). Likewise, the PAM or PAM250 matrices may also be used (Schwartz and Dayhoff, In Atlas of Protein Sequence and Structure, Dayhoff, ed., Natl Biomed. Res. Found., pp. 353-358 (1978)). The BLAST programs evaluate the statistical significance of all high-scoring segment pairs identified, and preferably select those segments which satisfy a user-specified threshold of significance, such as a user-specified percent homology. Preferably, the statistical significance of a high-scoring segment pair is evaluated using the statistical significance formula of Karlin (Karlin and Altschul (1990) supra).
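The five program descriptions listed above amount to a lookup from query type and database type to a program name. The following Python sketch records that lookup; the table layout and the helper name `programs_for` are hypothetical and are not part of any BLAST distribution.

```python
# Each entry: (query type, database type, which side is translated in six
# frames), transcribed from the five program descriptions in the text.
BLAST_PROGRAMS = {
    "BLASTP": ("protein", "protein", "none"),
    "BLAST3": ("protein", "protein", "none"),
    "BLASTN": ("nucleotide", "nucleotide", "none"),
    "BLASTX": ("nucleotide", "protein", "query"),
    "TBLASTN": ("protein", "nucleotide", "database"),
    "TBLASTX": ("nucleotide", "nucleotide", "both"),
}


def programs_for(query_type: str, database_type: str) -> list[str]:
    """List the programs whose query and database types match the request."""
    return [
        name
        for name, (query, database, _translated) in BLAST_PROGRAMS.items()
        if query == query_type and database == database_type
    ]


if __name__ == "__main__":
    # A protein query against a nucleotide database needs translation of
    # the database side.
    print(programs_for("protein", "nucleotide"))
```

Note that a nucleotide query against a nucleotide database matches two programs, which differ only in whether both sides are translated before comparison.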

“Percentage of sequence identity” can be determined from alignments performed using algorithms known in the art. Alignment of nucleotide or polypeptide sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman (Adv Appl Math 2:482 (1981)), by the homology alignment algorithm of Needleman and Wunsch (J. Mol Biol 48:443 (1970)), by the search for similarity method of Pearson and Lipman (Proc Natl Acad Sci USA 85:2444 (1988)), by computerized implementations of these algorithms (GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group), or by inspection. When two sequences have been identified for comparison, GAP and BESTFIT are preferably employed to determine their optimal alignment. Typically, the default values of 5.00 for gap weight and 0.30 for gap length weight are used. In a preferred embodiment, percent identity is determined using the GAP program for global alignment using default parameters, using the version of GAP found in the GCG package (Wisconsin Package Version 10.1, Genetics Computer Group, 575 Science Dr., Madison, Wis.).

“Percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the sequence in the comparison window may include additions or deletions, including for example gaps or overhangs, as compared to the reference sequence (which does not include additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleotide base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
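The calculation just described can be sketched in a few lines of Python. This is a minimal illustration that assumes the two sequences have already been aligned (for example by GAP) and are supplied as equal-length strings with `-` marking gaps; the function name and the gap symbol are choices made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical positions over an aligned comparison window.

    Both inputs must already be aligned, so they have the same length.
    A gap character never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    # Matched positions divided by all positions in the window, times 100.
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    # Six aligned positions, four of them identical.
    print(round(percent_identity("ACGT-A", "ACCTGA"), 2))
```

The denominator here is the full window length including gapped positions, following the wording of the definition; some tools instead divide by the number of ungapped positions, which gives a different value.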

In a broad sense, the term “substantially similar”, when used herein with respect to a nucleotide sequence, means a nucleotide sequence corresponding to a reference nucleotide sequence, wherein the corresponding sequence encodes a polypeptide having substantially the same structure as the polypeptide encoded by the reference nucleotide sequence. Desirably, the substantially similar nucleotide sequence encodes the polypeptide encoded by the reference nucleotide sequence. Preferably, “substantially similar” refers to nucleotide sequences having at least 50% sequence identity, preferably at least 60%, 70%, 80% or 85%, more preferably at least 90% or 95%, and even more preferably, at least 96%, 97% or 99% sequence identity compared to a reference sequence containing nucleotide sequences of Table 1, that encode a protein having at least 50% identity, more preferably at least 85% identity, yet still more preferably at least 90% identity to a region of sequence of a BIOPATH protein and/or an FPD, wherein the protein sequence comparisons are conducted using GAP analysis as described below. Also, “substantially similar” preferably also refers to nucleotide sequences having at least 50% identity, more preferably at least 80% identity, still more preferably 95% identity, yet still more preferably at least 99% identity, to a region of nucleotide sequence encoding a BIOPATH protein and/or an FPD, wherein the nucleotide sequence comparisons are conducted using GAP analysis as described below. The term “substantially similar” is specifically intended to include nucleotide sequences wherein the sequence has been modified to optimize expression in particular cells.

A polynucleotide including a nucleotide sequence “substantially similar” to the reference nucleotide sequence preferably hybridizes to a polynucleotide including the reference nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 2×SSC, 0.1% SDS at 50° C., more desirably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 1×SSC, 0.1% SDS at 50° C., more desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.5×SSC, 0.1% SDS at 50° C., preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 50° C., more preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 65° C.

The term “substantially similar”, when used herein with respect to a protein or polypeptide, means a protein or polypeptide corresponding to a reference protein, wherein the protein has substantially the same structure and function as the reference protein, where only changes in amino acid sequence that do not materially affect the polypeptide function occur. When used for a protein or an amino acid sequence, the percentage of identity between the substantially similar and the reference protein or amino acid sequence is desirably at least 30%, more preferably at least 40%, 50%, 60%, 70%, 80%, 85%, or 90%, still more preferably at least 95%, still more preferably at least 99%, with every individual number falling within this range of at least 30% to at least 99% also being part of the invention, using default GAP analysis parameters with the University of Wisconsin GCG (version 10), SEQWEB application of GAP, based on the algorithm of Needleman and Wunsch (1970), supra. As used herein the term “polypeptide of the present invention,” or any similar term refers to an amino acid sequence encoded by a DNA molecule including a nucleotide sequence substantially similar to an AC sequence. Homologs of BIOPATH protein and/or FPDs include amino acid sequences that are at least 30% identical to BIOPATH protein and/or FPD sequences found in searchable databases, as measured using the parameters described above.

“Target gene” refers to a gene on the replicon that expresses the desired target coding sequence, functional RNA, or protein. The target gene is not essential for replicon replication. Additionally, target genes may comprise native non-viral genes inserted into a non-native organism, or chimeric genes, and will be under the control of suitable regulatory sequences. Thus, the regulatory sequences in the target gene may come from any source, including the virus. Target genes may include coding sequences that are either heterologous or homologous to the genes of a particular plant to be transformed. However, target genes do not include native viral genes. Typical target genes include, but are not limited to, genes encoding a structural protein, a seed storage protein, a protein that conveys herbicide resistance, and a protein that conveys insect resistance. Proteins encoded by target genes are known as “foreign proteins”. The expression of a target gene in a plant will typically produce an altered plant trait.

The term “altered plant trait” means any phenotypic or genotypic change in a transgenic plant relative to the wild-type or non-transgenic plant host.

“Chromosomally-integrated” refers to the integration of a foreign gene or DNA construct into the host DNA by covalent bonds. Where genes are not “chromosomally integrated” they may be “transiently expressed.” Transient expression of a gene refers to the expression of a gene that is not integrated into the host chromosome but functions independently, either as part of an autonomously replicating plasmid or expression cassette, for example, or as part of another biological system such as a virus.

The term “transformation” refers to the transfer of a nucleic acid fragment into the genome of a host cell, resulting in genetically stable inheritance. Host cells containing the transformed nucleic acid fragments are referred to as “transgenic” cells, and organisms comprising transgenic cells are referred to as “transgenic organisms”. Examples of methods of transformation of plants and plant cells include Agrobacterium-mediated transformation (De Blaere et al., 1987) and particle bombardment technology (Klein et al. 1987; U.S. Pat. No. 4,945,050). Whole plants may be regenerated from transgenic cells by methods well known to the skilled artisan (see, for example, Fromm et al., 1990).

“Transformed,” “transgenic,” and “recombinant” refer to a host organism such as a bacterium or a plant into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome by methods generally known in the art, which are disclosed in Sambrook et al., 1989. See also Innis et al., 1995 and Gelfand, 1995; and Innis and Gelfand, 1999. Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially mismatched primers, and the like. For example, “transformed,” “transformant,” and “transgenic” plants or calli have been through the transformation process and contain a foreign gene integrated into their chromosome. The term “untransformed” refers to normal plants that have not been through the transformation process.

“Transiently transformed” refers to cells in which transgenes and foreign DNA have been introduced (for example, by such methods as Agrobacterium-mediated transformation or biolistic bombardment), but not selected for stable maintenance.

“Stably transformed” refers to cells that have been selected and regenerated on selection media following transformation.

“Transient expression” refers to expression in cells in which a virus or a transgene is introduced by viral infection or by such methods as Agrobacterium-mediated transformation, electroporation, or biolistic bombardment, but not selected for its stable maintenance.

“Genetically stable” and “heritable” refer to chromosomally-integrated genetic elements that are stably maintained in the plant and stably inherited by progeny through successive generations.

“Primary transformant” and “T0 generation” refer to transgenic plants that are of the same genetic generation as the tissue which was initially transformed (i.e., not having gone through meiosis and fertilization since transformation).

“Secondary transformants” and the “T1, T2, T3, etc. generations” refer to transgenic plants derived from primary transformants through one or more meiotic and fertilization cycles. They may be derived by self-fertilization of primary or secondary transformants or crosses of primary or secondary transformants with other transformed or untransformed plants.

“Wild-type” refers to a virus or organism found in nature without any known mutation.

“Genome” refers to the complete genetic material of an organism.

The term “nucleic acid” refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form, composed of monomers (nucleotides) containing a sugar, phosphate and a base which is either a purine or pyrimidine. Unless specifically limited, the term encompasses nucleic acids containing known analogs of natural nucleotides which have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., 1991; Ohtsuka et al., 1985; Rossolini et al. 1994). A “nucleic acid fragment” is a fraction of a given nucleic acid molecule. In higher plants, deoxyribonucleic acid (DNA) is the genetic material while ribonucleic acid (RNA) is involved in the transfer of information contained within DNA into proteins. The term “nucleotide sequence” refers to a polymer of DNA or RNA which can be single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases capable of incorporation into DNA or RNA polymers. The terms “nucleic acid” or “nucleic acid sequence” may also be used interchangeably with gene, cDNA, DNA and RNA encoded by a gene.

The invention encompasses isolated or substantially purified nucleic acid or protein compositions. In the context of the present invention, an “isolated” or “purified” DNA molecule or an “isolated” or “purified” polypeptide is a DNA molecule or polypeptide that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated DNA molecule or polypeptide may exist in a purified form or may exist in a non-native environment such as, for example, a transgenic host cell. For example, an “isolated” or “purified” nucleic acid molecule or protein, or biologically active portion thereof, is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. Preferably, an “isolated” nucleic acid is free of sequences (preferably protein encoding sequences) that naturally flank the nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequences that naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. A protein that is substantially free of cellular material includes preparations of protein or polypeptide having less than about 30%, 20%, 10%, or 5% (by dry weight) of contaminating protein. When the protein of the invention, or biologically active portion thereof, is recombinantly produced, preferably culture medium represents less than about 30%, 20%, 10%, or 5% (by dry weight) of chemical precursors or non-protein of interest chemicals.

The nucleotide sequences of the invention include both the naturally occurring sequences as well as mutant (variant) forms. Such variants will continue to possess the desired activity, i.e., either promoter activity or the activity of the product encoded by the open reading frame of the non-variant nucleotide sequence.

Thus, by “variants” is intended substantially similar sequences. For nucleotide sequences comprising an open reading frame, variants include those sequences that, because of the degeneracy of the genetic code, encode the identical amino acid sequence of the native protein. Naturally occurring allelic variants such as these can be identified with the use of well-known molecular biology techniques, as, for example, with polymerase chain reaction (PCR) and hybridization techniques. Variant nucleotide sequences also include synthetically derived nucleotide sequences, such as those generated, for example, by using site-directed mutagenesis and which, for open reading frames, encode the native protein, as well as those that encode a polypeptide having amino acid substitutions relative to the native protein. Generally, nucleotide sequence variants of the invention will have at least 40%, 50%, 60%, to 70%, e.g., preferably 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, to 79%, generally at least 80%, e.g., 81%-84%, at least 85%, e.g., 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, to 98% and 99% nucleotide sequence identity to the native (wild type or endogenous) nucleotide sequence.

“Conservatively modified variations” of a particular nucleic acid sequence refers to those nucleic acid sequences that encode identical or essentially identical amino acid sequences, or where the nucleic acid sequence does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given polypeptide. For instance, the codons CGT, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded protein. Such nucleic acid variations are “silent variations” which are one species of “conservatively modified variations.” Every nucleic acid sequence described herein which encodes a polypeptide also describes every possible silent variation, except where otherwise noted. One of skill will recognize that each codon in a nucleic acid (except ATG, which is ordinarily the only codon for methionine) can be modified to yield a functionally identical molecule by standard techniques. Accordingly, each “silent variation” of a nucleic acid which encodes a polypeptide is implicit in each described sequence.
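The arginine example can be checked with a short Python sketch that builds the standard genetic code and lists the codons that are interchangeable as silent variations. The helper names are hypothetical, and the sketch assumes the standard nuclear genetic code written with DNA letters.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each
# position ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid: str) -> list[str]:
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_silent(codon_a: str, codon_b: str) -> bool:
    """True if swapping one codon for the other leaves the residue unchanged."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


if __name__ == "__main__":
    print(synonymous_codons("R"))  # the six arginine codons named in the text
    print(synonymous_codons("M"))  # methionine has a single codon
```

Enumerating the table this way also shows why ATG is singled out in the definition: methionine is one of the few residues with no synonymous alternative.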

The nucleic acid molecules of the invention can be “optimized” for enhanced expression in plants of interest. See, for example, EPA 035472; WO 91/16432; Perlak et al., 1991; and Murray et al., 1989. In this manner, the open reading frames in genes or gene fragments can be synthesized utilizing plant-preferred codons. See, for example, Campbell and Gowri, 1990 for a discussion of host-preferred codon usage. Thus, the nucleotide sequences can be optimized for expression in any plant. It is recognized that all or any part of the gene sequence may be optimized or synthetic. That is, synthetic or partially optimized sequences may also be used. Variant nucleotide sequences and proteins also encompass sequences and proteins derived from a mutagenic and recombinogenic procedure such as DNA shuffling. With such a procedure, one or more different coding sequences can be manipulated to create a new polypeptide possessing the desired properties. In this manner, libraries of recombinant polynucleotides are generated from a population of related sequence polynucleotides comprising sequence regions that have substantial sequence identity and can be homologously recombined in vitro or in vivo. Strategies for such DNA shuffling are known in the art. See, for example, Stemmer, 1994; Stemmer, 1994; Crameri et al., 1997; Moore et al., 1997; Zhang et al., 1997; Crameri et al., 1998; and U.S. Pat. Nos. 5,605,793 and 5,837,458.

By “variant” polypeptide is intended a polypeptide derived from the native protein by deletion (so-called truncation) or addition of one or more amino acids to the N-terminal and/or C-terminal end of the native protein; deletion or addition of one or more amino acids at one or more sites in the native protein; or substitution of one or more amino acids at one or more sites in the native protein. Such variants may result from, for example, genetic polymorphism or from human manipulation. Methods for such manipulations are generally known in the art.

Thus, the polypeptides may be altered in various ways including amino acid substitutions, deletions, truncations, and insertions. Methods for such manipulations are generally known in the art. For example, amino acid sequence variants of the polypeptides can be prepared by mutations in the DNA. Methods for mutagenesis and nucleotide sequence alterations are well known in the art. See, for example, Kunkel, 1985; Kunkel et al., 1987; U.S. Pat. No. 4,873,192; Walker and Gaastra, 1983 and the references cited therein. Guidance as to appropriate amino acid substitutions that do not affect biological activity of the protein of interest may be found in the model of Dayhoff et al. (1978). Conservative substitutions, such as exchanging one amino acid with another having similar properties, are preferred.

Individual substitutions, deletions or additions that alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5%, more typically less than 1%) in an encoded sequence are “conservatively modified variations,” where the alterations result in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. The following five groups each contain amino acids that are conservative substitutions for one another: Aliphatic: Glycine (G), Alanine (A), Valine (V), Leucine (L), Isoleucine (I); Aromatic: Phenylalanine (F), Tyrosine (Y), Tryptophan (W); Sulfur-containing: Methionine (M), Cysteine (C); Basic: Arginine (R), Lysine (K), Histidine (H); Acidic: Aspartic acid (D), Glutamic acid (E), Asparagine (N), Glutamine (Q). See also, Creighton, 1984. In addition, individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids in an encoded sequence are also “conservatively modified variations.”
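The five substitution groups can be encoded directly as a lookup, as in the following Python sketch. The group membership is copied from the list above, including the placement of asparagine and glutamine with the acidic residues; the function name and data layout are assumptions made for this illustration.

```python
# Conservative substitution groups exactly as listed in the text
# (one-letter amino acid codes).
CONSERVATIVE_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aromatic": set("FYW"),
    "sulfur-containing": set("MC"),
    "basic": set("RKH"),
    "acidic": set("DENQ"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall within the same substitution group."""
    original, replacement = original.upper(), replacement.upper()
    return any(
        original in members and replacement in members
        for members in CONSERVATIVE_GROUPS.values()
    )


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # both aliphatic
    print(is_conservative("K", "E"))  # basic versus acidic
```

Residues that appear in none of the five groups, such as serine, threonine and proline, are never reported as conservative by this sketch, which mirrors the list as written rather than any particular substitution matrix.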

“Expression cassette” as used herein means a DNA sequence capable of directing expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter operably linked to the nucleotide sequence of interest which is operably linked to termination signals. It also typically comprises sequences required for proper translation of the nucleotide sequence. The coding region usually codes for a protein of interest but may also code for a functional RNA of interest, for example antisense RNA or a nontranslated RNA, in the sense or antisense direction. The expression cassette comprising the nucleotide sequence of interest may be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. The expression cassette may also be one which is naturally occurring but has been obtained in a recombinant form useful for heterologous expression. The expression of the nucleotide sequence in the expression cassette may be under the control of a constitutive promoter or of an inducible promoter which initiates transcription only when the host cell is exposed to some particular external stimulus. In the case of a multicellular organism, the promoter can also be specific to a particular tissue or organ or stage of development.

“Vector” is defined to include, inter alia, any plasmid, cosmid, phage or Agrobacterium binary vector in double or single stranded linear or circular form which may or may not be self transmissible or mobilizable, and which can transform a prokaryotic or eukaryotic host either by integration into the cellular genome or by existing extrachromosomally (e.g. autonomous replicating plasmid with an origin of replication).

Specifically included are shuttle vectors, by which is meant a DNA vehicle capable, naturally or by design, of replication in two different host organisms, which may be selected from actinomycetes and related species, bacteria and eukaryotes (e.g. higher plant, mammalian, yeast or fungal cells).

Preferably the nucleic acid in the vector is under the control of, and operably linked to, an appropriate promoter or other regulatory elements for transcription in a host cell such as a microbial, e.g. bacterial, or plant cell. The vector may be a bifunctional expression vector which functions in multiple hosts. In the case of genomic DNA, this may contain its own promoter or other regulatory elements and in the case of cDNA this may be under the control of an appropriate promoter or other regulatory elements for expression in the host cell.

“Cloning vectors” typically contain one or a small number of restriction endonuclease recognition sites at which foreign DNA sequences can be inserted in a determinable fashion without loss of essential biological function of the vector, as well as a marker gene that is suitable for use in the identification and selection of cells transformed with the cloning vector. Marker genes typically include genes that provide tetracycline resistance, hygromycin resistance or ampicillin resistance.

A “transgenic plant” is a plant having one or more plant cells that contain an expression vector.

“Plant tissue” includes differentiated and undifferentiated tissues or plants, including but not limited to roots, stems, shoots, leaves, pollen, seeds, tumor tissue and various forms of cells and culture such as single cells, protoplast, embryos, and callus tissue. The plant tissue may be in plants or in organ, tissue or cell culture.

The following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: (a) “reference sequence”, (b) “comparison window”, (c) “sequence identity”, (d) “percentage of sequence identity”, and (e) “substantial identity”.

(a) As used herein, “reference sequence” is a defined sequence used as a basis for sequence comparison. A reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full length cDNA or gene sequence, or the complete cDNA or gene sequence.

(b) As used herein, “comparison window” makes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Generally, the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100, or longer. Those of skill in the art understand that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence a gap penalty is typically introduced and is subtracted from the number of matches.

Methods of alignment of sequences for comparison are well known in the art. Thus, the determination of percent identity between any two sequences can be accomplished using a mathematical algorithm. Preferred, nonlimiting examples of such mathematical algorithms are the algorithm of Myers and Miller, 1988; the local homology algorithm of Smith et al. 1981; the homology alignment algorithm of Needleman and Wunsch 1970; the search-for-similarity method of Pearson and Lipman 1988; the algorithm of Karlin and Altschul, 1990, modified as in Karlin and Altschul, 1993.

Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity. Such implementations include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, Calif.); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group (GCG), 575 Science Drive, Madison, Wis., USA). Alignments using these programs can be performed using the default parameters. The CLUSTAL program is well described by Higgins et al. 1988; Higgins et al. 1989; Corpet et al. 1988; Huang et al. 1992; and Pearson et al. 1994. The ALIGN program is based on the algorithm of Myers and Miller, supra. The BLAST programs of Altschul et al., 1990, are based on the algorithm of Karlin and Altschul supra.

Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., 1990). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached.
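The seeding and extension steps described above can be illustrated with a minimal sketch. It is not the BLAST implementation: it assumes exact-match word hits only (the neighborhood threshold T is not modeled), extends without gaps in one direction only, and the function names `find_seeds` and `extend_right` are hypothetical. The default values of M, N and X are chosen for illustration.

```python
def find_seeds(query: str, subject: str, w: int) -> list[tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly.

    Assumption: only identical words are treated as hits; the threshold T
    for neighborhood words is not modeled in this sketch.
    """
    index: dict[str, list[int]] = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def extend_right(query: str, subject: str, qi: int, sj: int,
                 m: int = 5, n: int = -4, x: int = 20) -> tuple[int, int]:
    """Ungapped extension to the right from (qi, sj).

    Returns (best cumulative score, extension length at which it was reached).
    Extension stops when the score drops by x from its maximum, when the
    score reaches zero or below, or when either sequence ends.
    """
    score = best = best_len = k = 0
    while qi + k < len(query) and sj + k < len(subject):
        score += m if query[qi + k] == subject[sj + k] else n
        k += 1
        if score > best:
            best, best_len = score, k
        elif best - score >= x or score <= 0:
            break
    return best, best_len


query = "ACGTACGTTT"
subject = "GGACGTACGTAA"
seeds = find_seeds(query, subject, w=4)
print(seeds[0], extend_right(query, subject, 0, 2))
```

A full search would also extend each seed to the left and keep the highest scoring pair per diagonal; those steps are omitted here for brevity.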

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
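As a small illustration of the thresholds just stated, the following sketch maps a smallest sum probability P(N) onto the three tiers named in the text. The function name `similarity_tier` and the tier labels are hypothetical, and the word “about” in the text is treated here as a strict cutoff for simplicity.

```python
def similarity_tier(p_n: float) -> str:
    """Classify a smallest sum probability P(N) using the cutoffs 0.1, 0.01 and 0.001."""
    if not 0.0 <= p_n <= 1.0:
        raise ValueError("P(N) is a probability and must lie in [0, 1]")
    if p_n < 0.001:
        return "most preferred"
    if p_n < 0.01:
        return "more preferred"
    if p_n < 0.1:
        return "similar"
    return "not similar"


for p in (0.5, 0.05, 0.005, 0.0005):
    print(p, similarity_tier(p))
```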

To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul et al. 1997. Alternatively, PSI-BLAST (in BLAST 2.0) can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al., supra. When utilizing BLAST, Gapped BLAST, PSI-BLAST, the default parameters of the respective programs (e.g. BLASTN for nucleotide sequences, BLASTX for proteins) can be used. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, 1989). See http://www.ncbi.nlm.nih.gov. Alignment may also be performed manually by inspection.

For purposes of the present invention, comparison of nucleotide sequences for determination of percent sequence identity to the promoter sequences disclosed herein is preferably made using the BlastN program (version 1.4.7 or later) with its default parameters or any equivalent program. By “equivalent program” is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by the preferred program.

(c) As used herein, “sequence identity” or “identity” in the context of two nucleic acid or polypeptide sequences makes reference to the residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window. When percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences that differ by such conservative substitutions are said to have “sequence similarity” or “similarity.” Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, Calif.).
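The partial scoring of conservative substitutions can be sketched as follows. This is not the PC/GENE scoring scheme: the residue groupings in `CONSERVATIVE_GROUPS`, the partial score of 0.5 and the function names are assumptions made for illustration, and the two sequences are assumed to be pre-aligned, ungapped and of equal length.

```python
# Illustrative groupings of chemically similar residues (an assumption,
# not taken from the text or from any particular program).
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # positively charged
    set("DE"),    # negatively charged
    set("ST"),    # small hydroxyl
    set("NQ"),    # amide
]


def position_score(a: str, b: str, partial: float = 0.5) -> float:
    """1 for identical residues, `partial` for a conservative substitution, 0 otherwise."""
    if a == b:
        return 1.0
    if any(a in group and b in group for group in CONSERVATIVE_GROUPS):
        return partial
    return 0.0


def adjusted_percent_identity(seq1: str, seq2: str, partial: float = 0.5) -> float:
    """Percent identity adjusted upwards for conservative substitutions."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal aligned length")
    total = sum(position_score(a, b, partial) for a, b in zip(seq1, seq2))
    return 100.0 * total / len(seq1)


print(adjusted_percent_identity("MKLV", "MRLA"))  # K/R is conservative, V/A is not
```

With `partial=0.0` the function reduces to an unadjusted percent identity, which makes the size of the upward adjustment easy to inspect.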

(d) As used herein, “percentage of sequence identity” means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
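The calculation in (d) can be written out as a short sketch. It assumes the two sequences have already been optimally aligned by some other program and are passed in as equal-length strings in which `-` marks a gap; the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_test: str, gap: str = "-") -> float:
    """Matched positions divided by total positions in the comparison window, times 100.

    A position counts as matched only when both sequences carry the same
    residue; a gap never counts as a match.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_ref:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for a, b in zip(aligned_ref, aligned_test) if a == b and a != gap
    )
    return 100.0 * matched / len(aligned_ref)


# 7 of 8 window positions match; the gapped position is a non-match.
print(percent_identity("ACGTACGT", "ACGTAC-T"))
```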

(e)(i) The term “substantial identity” of polynucleotide sequences means that a polynucleotide comprises a sequence that has at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, preferably at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, more preferably at least 90%, 91%, 92%, 93%, or 94%, and most preferably at least 95%, 96%, 97%, 98%, or 99% sequence identity, compared to a reference sequence using one of the alignment programs described using standard parameters. One of skill in the art will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning, and the like. Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 70%, more preferably at least 80%, 90%, and most preferably at least 95%.

Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to each other under stringent conditions (see below). Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. However, stringent conditions encompass temperatures in the range of about 1° C. to about 20° C. lower than the T_(m), depending upon the desired degree of stringency as otherwise qualified herein. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides they encode are substantially identical. This may occur, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. One indication that two nucleic acid sequences are substantially identical is when the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid.

(e)(ii) The term “substantial identity” in the context of a peptide indicates that a peptide comprises a sequence with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, preferably 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, more preferably at least 90%, 91%, 92%, 93%, or 94%, or even more preferably, 95%, 96%, 97%, 98% or 99%, sequence identity to the reference sequence over a specified comparison window. Preferably, optimal alignment is conducted using the homology alignment algorithm of Needleman and Wunsch (1970). An indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution.

For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.

As noted above, another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions. The phrase “hybridizing specifically to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA. “Bind(s) substantially” refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target nucleic acid sequence.

“Stringent hybridization conditions” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization experiments such as Southern and Northern hybridization are sequence dependent, and are different under different environmental parameters. The T_(m) is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the T_(m) can be approximated from the equation of Meinkoth and Wahl, 1984: T_(m) = 81.5° C. + 16.6 (log M) + 0.41 (% GC) − 0.61 (% form) − 500/L; where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. T_(m) is reduced by about 1° C. for each 1% of mismatching; thus, T_(m), hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with >90% identity are sought, the T_(m) can be decreased 10° C. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (T_(m)) for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4° C. lower than the T_(m); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10° C. lower than the T_(m); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20° C. lower than the T_(m). Using the equation, hybridization and wash compositions, and desired T_(m), those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a T_(m) of less than 45° C. (aqueous solution) or 32° C. (formamide solution), it is preferred to increase the SSC concentration so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen, 1993. Generally, highly stringent hybridization and wash conditions are selected to be about 5° C. lower than the thermal melting point T_(m) for the specific sequence at a defined ionic strength and pH.
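The Meinkoth and Wahl approximation and the 1° C. per 1% mismatch rule can be worked through in a short sketch. The function names are hypothetical, the logarithm is taken as base 10, and the input values in the usage lines are illustrative rather than taken from the text.

```python
import math


def melting_temperature(monovalent_molarity: float, gc_percent: float,
                        formamide_percent: float, length_bp: int) -> float:
    """T_m in degrees C for a DNA-DNA hybrid:
    81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%form) - 500/L
    """
    if monovalent_molarity <= 0 or length_bp <= 0:
        raise ValueError("molarity and hybrid length must be positive")
    return (81.5
            + 16.6 * math.log10(monovalent_molarity)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def tm_for_identity(tm: float, identity_percent: float) -> float:
    """Lower T_m by about 1 degree C for each 1% of mismatching."""
    return tm - (100.0 - identity_percent)


# Illustrative inputs: 1 M monovalent cations, 50% GC, no formamide, 500 bp hybrid.
tm = melting_temperature(1.0, 50.0, 0.0, 500)
print(round(tm, 1))                         # perfectly matched hybrid
print(round(tm_for_identity(tm, 90.0), 1))  # target of 90% identity
print(round(tm - 5.0, 1))                   # "about 5 degrees C lower" stringent wash
```

Adding formamide lowers the estimate linearly, which is why the text distinguishes aqueous and formamide solutions when setting a minimum workable temperature.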

An example of highly stringent wash conditions is 0.15 M NaCl at 72° C. for about 15 minutes. An example of stringent wash conditions is a 0.2×SSC wash at 65° C. for 15 minutes (see, Sambrook, infra, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1×SSC at 45° C. for 15 minutes. An example low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6×SSC at 40° C. for 15 minutes. Stringent conditions typically involve salt concentrations of less than about 1.5 M, more preferably about 0.01 to 1.0 M, Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 30° C. for short probes (e.g., about 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., >50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise ratio of 2× (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the proteins that they encode are substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.

Very stringent conditions are selected to be equal to the T_(m) for a particular probe. An example of stringent conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide, e.g., hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37° C., and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C.
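Because the wash conditions above are stated as multiples of SSC, it can help to convert them to molar concentrations using the definition 20×SSC = 3.0 M NaCl/0.3 M trisodium citrate. The sketch below does only that linear scaling; the function name `ssc_composition` is hypothetical.

```python
SSC_20X_NACL_M = 3.0      # mol/L NaCl in 20x SSC, as defined in the text
SSC_20X_CITRATE_M = 0.3   # mol/L trisodium citrate in 20x SSC


def ssc_composition(fold: float) -> tuple[float, float]:
    """Return (NaCl molarity, trisodium citrate molarity) for an n-fold SSC solution."""
    if fold < 0:
        raise ValueError("SSC strength cannot be negative")
    return (SSC_20X_NACL_M * fold / 20.0, SSC_20X_CITRATE_M * fold / 20.0)


for fold in (2.0, 1.0, 0.5, 0.1):
    nacl, citrate = ssc_composition(fold)
    print(f"{fold}x SSC: {nacl:.4f} M NaCl, {citrate:.4f} M trisodium citrate")
```

On this scale 1×SSC contains 0.15 M NaCl, the same NaCl concentration quoted for the highly stringent wash example.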

The following are examples of sets of hybridization/wash conditions that may be used to clone orthologous nucleotide sequences that are substantially identical to reference nucleotide sequences of the present invention: a reference nucleotide sequence preferably hybridizes to the reference nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 2×SSC, 0.1% SDS at 50° C., more desirably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 1×SSC, 0.1% SDS at 50° C., more desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.5×SSC, 0.1% SDS at 50° C., preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 50° C., more preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 65° C.

“DNA shuffling” is a method to introduce mutations or rearrangements, preferably randomly, in a DNA molecule or to generate exchanges of DNA sequences between two or more DNA molecules, preferably randomly. The DNA molecule resulting from DNA shuffling is a shuffled DNA molecule that is a non-naturally occurring DNA molecule derived from at least one template DNA molecule. The shuffled DNA preferably encodes a variant polypeptide modified with respect to the polypeptide encoded by the template DNA, and may have an altered biological activity with respect to the polypeptide encoded by the template DNA.

“Recombinant DNA molecule” is a combination of DNA sequences that are joined together using recombinant DNA technology and procedures used to join together DNA sequences as described, for example, in Sambrook et al., 1989.

The word “plant” refers to any plant, particularly to seed plants, and “plant cell” is a structural and physiological unit of the plant, which comprises a cell wall but may also refer to a protoplast. The plant cell may be in the form of an isolated single cell or a cultured cell, or a part of a higher organized unit such as, for example, a plant tissue, or a plant organ.

“Significant increase” is an increase that is larger than the margin of error inherent in the measurement technique, preferably an increase by about 2-fold or greater.

“Significantly less” means that the decrease is larger than the margin of error inherent in the measurement technique, preferably a decrease by about 2-fold or greater.
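The preferred 2-fold criterion in the two definitions above can be expressed as a simple fold-change test. This sketch models only the preferred 2-fold threshold and leaves the margin of error of the measurement technique to the caller; the function names and the example values are hypothetical.

```python
def fold_change(reference: float, test: float) -> float:
    """Ratio of the test measurement to the reference measurement."""
    if reference <= 0 or test <= 0:
        raise ValueError("measurements must be positive to form a fold change")
    return test / reference


def is_significant_increase(reference: float, test: float, fold: float = 2.0) -> bool:
    """True when the test value is at least `fold` times the reference."""
    return fold_change(reference, test) >= fold


def is_significantly_less(reference: float, test: float, fold: float = 2.0) -> bool:
    """True when the test value is at most 1/`fold` of the reference."""
    return fold_change(reference, test) <= 1.0 / fold


print(is_significant_increase(10.0, 25.0))  # 2.5-fold increase
print(is_significantly_less(10.0, 4.0))     # 2.5-fold decrease
```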

Within the scope of the present invention a set of nucleic acid molecules is provided which comprises polynucleotides relating to genes which are shown to be preferentially up-regulated and to share a similar expression pattern during the process of grain filling. The polynucleotides within this subgroup are useful tools for generating plants which produce grain with modified compositional characteristics leading to improved nutritional properties.

In one embodiment, the present invention thus relates to an isolated nucleic acid molecule comprising a nucleotide sequence encoding a polypeptide the expression of which is up-regulated during grain filling and the use of said molecule for modifying the nutritional composition and quality of the plant grain.

The majority of the polynucleotides within this group encode protein products that are directly involved in or associated with three major pathways of nutrition partitioning: the synthesis and transport of (1) carbohydrates, (2) proteins, and (3) fatty acids.

Carbohydrates are the most abundant organic molecules in nature and modulation of their synthesis, accumulation, and storage presents a vast template of possibilities for improving the quality and quantity of agricultural plants, food crops, consumer health products such as dietary supplements, and many industrial applications. In plants, carbohydrates occur as mono-, di-, or polysaccharides and have the essential functions of providing the plant with chemical energy and structural stability. Although sugar uptake from external sources generally is not a relevant process, the redistribution of sugar (usually glucose) from photosynthesizing tissues to non-green cells is of major importance. Once translocated to terminal sink storage tissues, sugars are converted to starch and stored in the leucoplasts of seeds, fruits, tubers and roots, as well as actively growing photosynthetic tissues. These plant tissues provide the bulk of human dietary intake, and as such, the anabolic pathways of synthesis and assimilation (starch, fatty acids, and nitrogen) are of particular importance to agriculture and commercial industry.

As major contributors to the global carbon cycle, plants and algae bind 100 billion metric tons of carbon into carbohydrates each year. Nucleotide sequences encoding at least one polypeptide involved in sugar and carbohydrate metabolism and their end products, as well as the polypeptides encoded thereby, or antigene sequences thereof, are commercially useful materials that can be used to study these processes and to modify these processes to elicit desired modifications in the compositional and nutritional characteristics of the plant grain.

In particular, the subset of nucleic acid molecules provided herein, which comprises polynucleotides relating to genes that are up-regulated during grain filling and involved in carbohydrate transport, synthesis, metabolism, or degradation, is a valuable tool box from which an appropriate nucleic acid molecule can be chosen for modifying the quantity and quality of the carbohydrate and sugar content of the grain, respectively. This can be achieved by introducing and overexpressing at least one polynucleotide from the various subsets of nucleic acid molecules provided herein in the plant, but preferentially in the appropriate tissues of the plant grain such as, for example, the plant endosperm, or by reducing the expression level of the corresponding endogenous gene by methods known in the art including antisense and dsRNAi techniques.

It is thus one of the major objectives of the present invention to identify and provide a subset of nucleic acid molecules comprising at least one polynucleotide which encodes a protein that is involved in the metabolism of carbohydrates during grain filling. By modifying the expression level of at least one of the polynucleotides from this subgroup in a plant, but preferably in the appropriate tissues of the plant grain such as, for example, the plant endosperm, and even more preferably at an early stage in seed development, it is possible to modify the carbohydrate composition of the plant grain accordingly.

In one embodiment, the invention thus relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the synthesis, metabolism or degradation of carbohydrates in the plant grain and the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a sequence encoding a polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 70-210.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the synthesis, metabolism or degradation of carbohydrates in the plant grain and the expression of which is up-regulated during grain filling, and which is substantially similar, and preferably has at least between 70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs given in table 7 such as SEQ ID NOs: 70-210, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the synthesis, metabolism or degradation of carbohydrates in the plant grain and the expression of which is up-regulated during grain filling, and which is immunologically reactive with antibodies raised against a polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 70-210.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of SEQ ID NOs of table 7 such as SEQ ID NOs: 69-209 or a part thereof which still encodes a partial length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs of table 7 such as SEQ ID NOs: 69-209 or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).
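Items (e) and (f) above distinguish a complementary sequence from a reverse complement. A minimal sketch of the two operations on a DNA string follows; the function names are hypothetical, only the unambiguous bases A, C, G and T are handled, and the example sequence is arbitrary rather than one of the disclosed SEQ ID NOs.

```python
# Translation table for Watson-Crick base pairing (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Base-by-base complement, read in the same left-to-right order."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction, i.e. the opposite strand 5' to 3'."""
    return complement(seq)[::-1]


example = "ATGC"  # arbitrary illustrative sequence
print(complement(example))
print(reverse_complement(example))
```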

One of the defining questions in assimilate partitioning is understanding how plants regulate the allocation of photosynthate between competing sink organs. In addition to the number of competing organs, and the sink strength of each, exogenous factors such as abiotic stress or pathogen infection may also influence partitioning (Bush, Current Opinions in Plant Biology 2:187 (1999)).

Within the present invention a subset of genes could be identified that are known to be involved in the plant's response to abiotic and/or biotic stresses and demonstrated to be up-regulated during grain filling. By providing these genes it is now possible to regulate the expression levels of the encoded protein products in the plant grain during the grain filling process by applying methods known in the art including overexpressing or down-regulating the nucleic acid molecule in a plant, or preferably a plant seed, thereby modifying the partitioning in the developing grain.

In one aspect, the present invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the expression of which is up-regulated during grain filling and the activity of which is involved in or associated with the plant's response to abiotic and/or biotic stresses, which nucleotide sequence is substantially similar to a sequence encoding a polypeptide as given in any one of the SEQ ID NOs of table 4 such as SEQ ID NOs: 2-18.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the expression of which is up-regulated during grain filling and the activity of which is involved in or associated with the plant's response to abiotic and/or biotic stresses, and which is substantially similar, and preferably has at least between 70% and 99% amino acid sequence identity to at least one polypeptide as given in any one of the SEQ ID NOs of table 4 such as SEQ ID NOs: 2-18, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the expression of which is up-regulated during grain filling and the activity of which is involved in or associated with the plant's response to abiotic and/or biotic stresses, and which is immunologically reactive with antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 4 such as SEQ ID NOs: 2-18.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of the SEQ ID NOs of table 4 such as SEQ ID NOs: 1-17 or a part thereof which still encodes a partial length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ ID NOs of table 4 such as SEQ ID NOs: 1-17 or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

The regulation of source-sink pathways encompasses complex mechanisms that integrate the expression of enzymes involved in carbohydrate production in source tissue with those involved with utilization in sink tissue. The elucidation of the underlying signal transduction pathways of sink-source regulation is of critical importance to the genetic manipulation of source-sink relations in transgenic plants.

Within the scope of the present invention a subset of genes was identified comprising genes that are up-regulated during grain filling and encode polypeptides with a kinase or phosphatase activity which are known to be involved in signal transduction pathways.

In a specific embodiment, the present invention provides nucleic acid molecules such as those represented in SEQ ID NOs: 19-29 that encode enzymes which exhibit a kinase or phosphatase activity and/or are involved in a signaling pathway and are thus key to the ability to regulate the utilization of carbon/sugar sources and the partitioning of assimilates between source and sink tissues.

The invention thus relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which exhibits a kinase or phosphatase activity and/or is involved in a signal transduction pathway, the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a sequence encoding a polypeptide as given in any one of the SEQ ID NOs of table 5 such as SEQ ID NOs: 20-30.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which exhibits a kinase or phosphatase activity and is up-regulated during grain filling and has at least between 70% and 99% amino acid sequence identity to at least one polypeptide as given in any one of the SEQ ID NOs of table 5 such as SEQ ID NOs: 20-30, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which exhibits a kinase or phosphatase activity and is up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 5 such as SEQ ID NOs: 20-30.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   -   a) as given in any one of the SEQ ID NOs of table 5 such as SEQ ID NOs: 19-29 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
    -   b) having substantial similarity to (a);
    -   c) capable of hybridizing to (a) or the complement thereof;
    -   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ ID NOs of table 5 such as SEQ ID NOs: 19-29 or the complement thereof;
    -   e) complementary to (a), (b) or (c); and
    -   f) which is the reverse complement of (a), (b) or (c).
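The sequence relationships named in items (d) to (f) can be sketched in Python as follows. The helper names are hypothetical; the sketch handles only the four unambiguous DNA bases and says nothing about hybridization conditions, which depend on factors such as temperature and salt concentration.

```python
# Translation table for the four unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Base-by-base complement, read in the same direction as the input."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction (the opposite strand, 5' to 3')."""
    return complement(seq)[::-1]


def consecutive_windows(seq: str, length: int):
    """Yield every run of `length` consecutive nucleotides contained in `seq`."""
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]
```

Calling `consecutive_windows(sequence, 50)` would enumerate the 50-nucleotide stretches that item (d) refers to at the lower end of its range; a sequence shorter than the requested length yields nothing.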

Regulating the environment-induced carbon status in crop plants, particularly the partitioning in storage organs, provides industry with the ability to limit or expand growing seasons to better suit commercial markets, to enhance the quality and content of food products derived from storage organs or other tissue-specific components of crop plants, and to modulate many other metabolic pathways in plants (such as nitrogen assimilation, phosphorylation and the activation of regulatory proteins) that affect consumer end use.

Another possibility for modifying the carbohydrate content of the grain is through regulation of the transport of sugars and carbohydrates during grain filling.

Supplying carbohydrates to sink tissues via apoplastic mechanisms involves the release of sucrose into the apoplast by an exporter, cleavage by an extracellular invertase, and uptake of hexose monomers by monosaccharide transporters.

In one specific embodiment the present invention thus relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide with an activity which is involved in or associated with sugar transport and up-regulated during grain filling, which nucleotide sequence is substantially similar to a sequence encoding a polypeptide as given in any one of the SEQ ID NOs of table 6 such as SEQ ID NOs: 36, 50, and 58.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide with an activity which is involved in or associated with sugar transport and up-regulated during grain filling and is substantially similar, and preferably has at least between 70% and 99% amino acid sequence identity, to at least one polypeptide as given in any one of the SEQ ID NOs of table 6 such as SEQ ID NOs: 36, 50, and 58, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide with an activity which is involved in or associated with sugar transport and up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 6 such as SEQ ID NOs: 36, 50, and 58.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   -   a) as given in any one of the SEQ ID NOs of table 6 such as SEQ ID NOs: 35, 49, and 57 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
    -   b) having substantial similarity to (a);
    -   c) capable of hybridizing to (a) or the complement thereof;
    -   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ ID NOs of table 6 such as SEQ ID NOs: 35, 49, and 57 or the complement thereof;
    -   e) complementary to (a), (b) or (c); and
    -   f) which is the reverse complement of (a), (b) or (c).

Transmembrane transport of sugars has been demonstrated by the presence of transporter genes for a few crop species (spinach, potato). For the uses and application of modifying sugar transport mechanisms with regard to controlling the timing and extent of grain fill durations, we incorporate all relevant sections of PCT Publication WO9953068 to Allen et al., and for uses and application of modifying cells or plastids involved in hexose carrier proteins we incorporate all relevant sections of PCT Publication WO9953082 to Allen et al.

Glucosyl equivalents for starch biosynthesis are found within the scope of the present invention to be transported into the plastid (amyloplast) either as glucose-1-phosphate via a hexose-phosphate-Pi transporter (a representative example of which is given in SEQ ID NO: 35), as triose phosphates via a triose-phosphate-Pi translocator (a representative example of which is given in SEQ ID NO: 163), as phosphoenolpyruvate via a PEP-Pi translocator (SEQ ID NO: 175), or as ADP-glucose via a Brittle-like adenylate translocator or via an oxoglutarate/malate transporter. One isoform of a triose-phosphate/phosphate translocator (SEQ ID NO: 163) is expressed to a slightly higher level during earlier stages of grain development.

Pyruvate appears to play a more important role during early stages of grain development in that a gene encoding an isoform of a PEP-Pi translocator (SEQ ID NO: 175) is relatively more highly expressed at this stage. In maize endosperm, the majority of glucosyl moieties are transported to the amyloplast during the linear phase of starch accumulation as ADP-glucose (J. C. Shannon et al., Plant Physiol. 117, 1235 (1998)).

For uses and application of modifying amyloplasts in the regulation of starch production via an ADP-glucose transporter, we incorporate all relevant sections of PCT Publication WO9947681 to Emes et al.

Further examples of genes encoding a sugar transporter are provided in SEQ ID NOs: 35; 49, and 57. By providing the nucleic acid molecules according to the invention encoding sugar transporters the expression of which is up-regulated during grain filling such as those given in SEQ ID NOs: 36; 50, and 58; 36385; 53483; it is now possible to manipulate the translocation and storage of sugars and their carbohydrate end products in the plant grain.

In still another embodiment the present invention provides a further subset of nucleic acid molecules which are up-regulated during grain filling comprising a nucleotide sequence encoding a polypeptide that has a transmembrane domain and assists in the transport of amino acids and inorganic compounds including nitrate and various cations, which nucleotide sequence is substantially similar to a sequence encoding a polypeptide as given in SEQ ID NOs: 32; 38; 40; 42; 44; 46; 48; 52; 54; 56; 60; 62; 64; 66; and 68.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide that has a transmembrane domain and assists in the transport of amino acids and inorganic compounds including nitrate and various cations and is up-regulated during grain filling and is substantially similar, and preferably has at least between 70% and 99% amino acid sequence identity, to at least one polypeptide of SEQ ID NOs: 32; 38; 40; 42; 44; 46; 48; 52; 54; 56; 60; 62; 64; 66; and 68, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide that has a transmembrane domain and assists in the transport of amino acids and inorganic compounds including nitrate and various cations and is up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 32; 38; 40; 42; 44; 46; 48; 52; 54; 56; 60; 62; 64; 66; and 68.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   -   a) as given in any one of SEQ ID NOs: 31; 37; 39; 41; 43; 45; 47; 51; 53; 55; 59; 61; 63; 65; and 67 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
    -   b) having substantial similarity to (a);
    -   c) capable of hybridizing to (a) or the complement thereof;
    -   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 31; 37; 39; 41; 43; 45; 47; 51; 53; 55; 59; 61; 63; 65; and 67, or the complement thereof;
    -   e) complementary to (a), (b) or (c); and
    -   f) which is the reverse complement of (a), (b) or (c).

In particular, the invention provides a nucleic acid molecule which is up-regulated during grain filling and comprises a nucleotide sequence encoding a polypeptide that belongs to the POT or PTR family.

The POT family (also called the PTR (peptide transport) family) consists of proteins from animals, plants, yeast, archaea, and both Gram-negative and Gram-positive bacteria. Several of these organisms possess multiple POT family paralogues. The proteins are about 450-600 amino acyl residues in length, with the eukaryotic proteins in general being longer than the bacterial proteins. They exhibit 12 putative or established transmembrane α-helical spanners. Some members of the POT family exhibit limited sequence similarity to protein members of the major facilitator superfamily (MFS; TC #2.A.1), with comparison scores of up to 8 standard deviations for segments in excess of 60 residues in length. Thus the POT family is probably a family within the MFS.

While most members of the POT family catalyze peptide transport, one is a nitrate permease and one can transport histidine as well as peptides. Some of the peptide transporters can also transport antibiotics. They function by proton symport, but the substrate:H⁺ stoichiometry is variable: the high-affinity rat PepT2 carrier catalyzes uptake of 2 and 3 H⁺ with neutral and anionic dipeptides, respectively, while the low-affinity PepT1 carrier catalyzes uptake of one H⁺ per neutral peptide. In eukaryotes, some of these transporters may be in organellar membranes such as those of the lysosomes.

The generalized transport reaction catalyzed by the proteins of the POT family is:

substrate (out) + nH⁺ (out) → substrate (in) + nH⁺ (in).
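The bookkeeping implied by this generalized reaction can be sketched in Python. The function name and the dictionary layout are hypothetical; the only facts taken from the text are that each transport cycle moves one substrate molecule together with n protons from outside to inside, and that n is 2 or 3 for PepT2 (neutral or anionic dipeptides) and 1 for PepT1 (neutral peptides).

```python
def symport(substrate_out: int, protons_out: int, n: int, cycles: int) -> dict:
    """Apply `cycles` transport cycles, each moving 1 substrate and n H+ inward.

    Quantities are counts of molecules or ions; no kinetics or energetics are modeled.
    """
    if n < 1 or cycles < 0:
        raise ValueError("n must be at least 1 and cycles must be non-negative")
    if cycles > substrate_out or n * cycles > protons_out:
        raise ValueError("not enough substrate or protons outside for that many cycles")
    return {
        "substrate_out": substrate_out - cycles,
        "protons_out": protons_out - n * cycles,
        "substrate_in": cycles,
        "protons_in": n * cycles,
    }
```

With `n=2`, five cycles starting from 10 substrate molecules and 30 protons outside leave 5 substrate molecules and 20 protons outside and place 5 substrate molecules and 10 protons inside.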

In a specific embodiment, the present invention relates to an isolated nucleic acid molecule which is up-regulated during grain filling and comprises a nucleotide sequence encoding a polypeptide that belongs to the POT or PTR family, which nucleotide sequence is substantially similar to a sequence encoding a polypeptide as given in SEQ ID NOs: 38, 52, and 68.

In particular, the invention relates to an isolated nucleic acid molecule comprising a nucleotide sequence encoding a polypeptide which belongs to the POT or PTR family and is up-regulated during grain filling and is substantially similar, and preferably has at least between 70% and 99% amino acid sequence identity, to at least one polypeptide of SEQ ID NOs: 38, 52, and 68, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to an isolated nucleic acid molecule comprising a nucleotide sequence encoding a polypeptide which belongs to the POT or PTR family and is up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 38, 52, and 68.

More particularly, the invention relates to an isolated nucleic acid molecule comprising a nucleotide sequence

-   -   a) as given in any one of SEQ ID NOs: 37, 51, and 67 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
    -   b) having substantial similarity to (a);
    -   c) capable of hybridizing to (a) or the complement thereof;
    -   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 37, 51, and 67 or the complement thereof;
    -   e) complementary to (a), (b) or (c); and
    -   f) which is the reverse complement of (a), (b) or (c).

One of the economically most important and valuable carbohydrate end products is starch, which is an essential component of many food, feed, and industrial products. It consists of two types of glucan polymers: relatively long-chained polymers with few branches known as amylose, and shorter-chained but highly branched molecules called amylopectin.

Starch biosynthesis depends on the complex interaction of multiple enzymes (Smith, A. et al., (1995) Plant Physiol. 107:673-677; Preiss, J., (1988) Biochemistry of Plants 14:181-253). The key enzymes in starch biosynthesis are ADP-glucose pyrophosphorylase, which catalyzes the formation of ADP-glucose; a series of starch synthases, which use ADP-glucose as a substrate for polymer formation using α-1,4 linkages; and several starch branching enzymes, which modify the polymer by transferring segments of polymer to other parts of the polymer using α-1,6 linkages, creating branched structures. However, based on data from starch-forming plants such as potato and corn, it is becoming clear that other enzymes also play a role in the determination of the final structure of starch. In particular, debranching and disproportionating enzymes participate not only in starch degradation, but also in modification of starch structure during its biosynthesis. Different models for this action have been proposed, but all share the concept that such activities, or lack thereof, change the structure of the starch produced.

In plants typically used for the production of starch, such as maize or potato, the synthesized starch consists of approximately 25% amylose starch and about 75% amylopectin starch.

With respect to the homogeneity of the basic component starch for its use in the industrial area, starch-producing plants are needed which contain, for example, only the component amylopectin or only the component amylose. For a number of other uses, plants are needed that synthesize amylopectin types with different degrees of branching.

Such plants may for example be obtained by breeding or by means of mutagenesis techniques. It is known for various plant species, such as maize, that by means of mutagenesis varieties may be produced in which only amylopectin is formed. Also in the case of potato a genotype was produced from a haploid line by means of chemical mutagenesis. Said genotype does not form amylose (Hovenkamp-Hermelink, Theor. Appl. Genet. 75 (1987), 217-221).

Apart from conventional breeding and mutagenesis techniques, recombinant DNA techniques are now increasingly used in order to specifically interfere with the starch metabolism of starch-storing plants. A prerequisite for this is that DNA sequences be provided which encode enzymes involved in the starch metabolism.

The present invention now provides a subset of nucleic acid molecules that are involved in the starch biosynthesis pathway and were shown to be up-regulated during grain filling. Representative examples of genes in that subset are provided in SEQ ID NOs: 69-187 of the Sequence Listing.

In a particular embodiment, the present invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which is involved in or associated with starch biosynthesis and up-regulated during grain filling, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in any one of the SEQ ID NOs of table 7 such as SEQ ID NOs: 70-188.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which is involved in or associated with starch biosynthesis and up-regulated during grain filling and is substantially similar, and preferably has at least between 70% and 99% amino acid sequence identity, to at least one polypeptide as given in any one of the SEQ ID NOs of table 7 such as SEQ ID NOs: 70-188, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which is involved in or associated with starch biosynthesis and up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 7 such as SEQ ID NOs: 70-188.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   -   a) as given in any one of the SEQ ID NOs of table 7 such as SEQ ID NOs: 69-187 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
    -   b) having substantial similarity to (a);
    -   c) capable of hybridizing to (a) or the complement thereof;
    -   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ ID NOs of table 7 such as SEQ ID NOs: 69-187, or the complement thereof;
    -   e) complementary to (a), (b) or (c); and
    -   f) which is the reverse complement of (a), (b) or (c).
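The SEQ ID ranges cited in this section follow a visible pattern: nucleotide sequences are cited with odd numbers (for example SEQ ID NOs: 69-187) and the encoded polypeptides with the next even number (SEQ ID NOs: 70-188). The following Python sketch assumes that this pairing holds throughout the Sequence Listing; that is an inference from the ranges quoted here, not a rule stated in the text, and the function names are hypothetical.

```python
def polypeptide_id(nucleotide_id: int) -> int:
    """Return the SEQ ID NO of the polypeptide paired with an odd-numbered nucleotide SEQ ID NO.

    Assumption: nucleotide sequences are odd-numbered and the encoded
    polypeptide carries the next even number.
    """
    if nucleotide_id < 1 or nucleotide_id % 2 == 0:
        raise ValueError("expected a positive, odd nucleotide SEQ ID NO")
    return nucleotide_id + 1


def paired_range(first_nt: int, last_nt: int) -> list:
    """List (nucleotide, polypeptide) SEQ ID NO pairs for an odd-numbered nucleotide range."""
    return [(n, polypeptide_id(n)) for n in range(first_nt, last_nt + 1, 2)]
```

Under that assumption, `paired_range(135, 141)` pairs each nucleotide sequence of the ADPG pyrophosphorylase group with a polypeptide in the range 136-142.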

By providing a subset of genes encoding polypeptides that are involved in starch metabolism it is now possible to interfere with starch metabolism to produce starch with modified physico-chemical characteristics.

A gene encoding the small subunit of ADPG pyrophosphorylase (SEQ ID NO: 138) is expressed at early stages of grain development in conjunction with a single gene encoding a large subunit (SEQ ID NO: 140). Three other large subunits (SEQ ID NOs: 136; 142) are up-regulated at a later stage in development, from 4 days after anthesis, in conjunction with the up-regulation of the starch synthase genes (SEQ ID NOs: 129; 131; and 133) and two genes for branching enzymes (SEQ ID NOs: 70; and 72) (involved in amylose and amylopectin biosynthesis, respectively). Only one of the small subunit genes (distinct from the two mentioned above) increases in this time period. The expression of different isoforms may be related to the shift to storage starch production and a postulated concomitant shift to cytoplasmic ADP-glucose production (Stark, D. M., et al., “Regulation of the Amount of Starch in Plant Tissues by ADP Glucose Pyrophosphorylase”, Science, 258, 287-291 (Oct. 9, 1992)).

In one embodiment the present invention provides a nucleic acid molecule comprising a nucleotide sequence which encodes a small subunit of ADPG pyrophosphorylase. In another embodiment the invention provides a nucleic acid molecule comprising a nucleotide sequence which encodes a large subunit of ADPG pyrophosphorylase.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide with an activity of a small and large subunit ADPG pyrophosphorylase, respectively, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 136-142.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide with an activity of a small and large subunit ADPG pyrophosphorylase, respectively, which is up-regulated during grain filling and has at least between 70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs: 136-142, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide with an activity of a small and large subunit ADPG pyrophosphorylase, respectively, which is up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 136-142.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   -   a) as given in any one of SEQ ID NOs: 135-141 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
    -   b) having substantial similarity to (a);
    -   c) capable of hybridizing to (a) or the complement thereof;
    -   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 135-141, or the complement thereof;
    -   e) complementary to (a), (b) or (c); and
    -   f) which is the reverse complement of (a), (b) or (c).

The nucleic acid molecules of the instant invention may be used to create transgenic plants in which the small and/or large subunits of ADPG pyrophosphorylase are present at higher or lower levels than normal or in cell types or developmental stages in which they are not normally found. This may have the effect of altering starch structure in those cells or tissues, but especially in the developing grain.

For a further targeted modification of the starch in plants, in particular of the degree of branching of starch synthesized in plants by means of recombinant DNA techniques, it is still necessary to identify DNA sequences that encode enzymes participating in the starch metabolism, particularly in the branching of starch molecules.

In the case of potato, for example, DNA sequences have by now been described which encode a granule-bound starch synthase or a branching enzyme (Q enzyme), and they have been used in order to genetically modify plants.

Apart from the Q enzymes that introduce branchings into starch molecules, enzymes occur in plants which are capable of dissolving branchings. These enzymes are called debranching enzymes.

In the case of sugar beet, Li et al. (Plant Physiol. 98 (1992), 1277-1284) could only prove the occurrence of one debranching enzyme, apart from five endo- and two exoamylases. This enzyme, having a size of approximately 100 kD and an optimum pH value of 5.5, is located within the chloroplasts. A debranching enzyme was also described for spinach. The debranching enzyme from spinach as well as that from sugar beet exhibit a fivefold lower activity in a reaction with amylopectin as substrate when compared to a reaction with pullulan as a substrate (Ludwig et al., Plant Physiol. 74 (1984), 856-861; Li et al., Plant Physiol. 98 (1992), 1277-1284). The isolation of a cDNA encoding a debranching enzyme was described for spinach (Renz et al., Plant Physiol. 108 (1995), 1342).

The existence of a debranching enzyme for maize has been described in the prior art. The corresponding mutant was designated su (sugary). The gene of the sugary locus was cloned recently (see James et al., Plant Cell 7 (1995), 417-429). In the case of the agriculturally significant starch-storing cultured plant potato, the activity of a debranching enzyme was examined by Hobson et al. (J. Chem. Soc., (1951), 1451). It was proven that the respective enzyme, contrary to the Q enzyme, does not exhibit any activities leading to an elongation of the polysaccharide chain, but merely hydrolyses α-1,6-glycosidic bonds.

Within the scope of the present invention a subset of genes is provided that encode polypeptides the activity of which is associated with the structural shaping of the starch granule. In particular, the invention provides a subset of genes that encode polypeptides the activity of which is associated with the branching/debranching (representative examples of which are given in SEQ ID NOs: 69-73/75; 77 (isoamylase debranching enzyme)) and/or degradation of starch (α-amylase (SEQ ID NOs: 79-91), pullulanase (SEQ ID NO: 109) [the last gene in the α-amylase series], α-amylase inhibitor (SEQ ID NOs: 93-99), β-amylase (SEQ ID NOs: 101-107), α-glucosidase (SEQ ID NOs: 111-117)). By modulating the expression of the polypeptides according to the invention, the amylose/amylopectin ratio can be changed in order to accommodate the varying quality standards for food and/or feed applications or specific processing requirements. For example, by over-expressing or inhibiting the expression of endogenous branching and/or debranching enzyme genes in rice or any other cereal crop plant, a plant can be produced that exhibits increased or reduced amounts, respectively, of branching/debranching enzyme activity for the purpose of modifying the degree of branching of the amylopectin starch.

By inhibiting the expression of endogenous branching and/or debranching enzyme genes, plants are produced that exhibit a reduced activity of these enzymes, which leads to the synthesis of a modified starch. Inhibition of branching/debranching gene expression can be achieved by applying methods known in the art such as, for example, antisense or dsRNAi techniques. By applying these techniques it is possible to produce plants in which the expression of an endogenous branching/debranching enzyme gene in rice or any other cereal crop plant is inhibited to different degrees within the range of 0.1% to 100%, with all individual numbers within this range also being part of the invention. This enables in particular the production of cereal plants synthesizing amylopectin starch with a wide variety of degrees of branching. This constitutes an advantage over conventional breeding and mutagenesis techniques, in which a lot of time and cost is required in order to provide such a variety. Highly branched amylopectin has a particularly large surface and is therefore particularly suitable as a copolymer. A high degree of branching furthermore leads to an improvement of the amylopectin's solubility in water. This property is very advantageous for certain technical applications.

Another way of modifying the branching characteristics of starch is by overexpressing the nucleic acid molecule according to the invention encoding a branching/debranching enzyme activity in rice in a transgenic plant, but especially a plant seed.

The expression of a novel or additional branching/debranching enzyme activity from rice in the transgenic plant cells and plants of the invention influences the degree of branching of the amylopectin synthesized in the cells and plants. Therefore, a starch synthesized in these plants exhibits modified physical and/or chemical properties when compared to starch from wildtype plants.

Genes encoding products involved in starch structure rearrangement (debranching enzyme (SEQ ID NOs: 75-77 (isoamylase debranching enzyme)); branching enzyme (SEQ ID NOs: 69-73)) and starch degradation (α-amylase (SEQ ID NOs: 79-91), α-amylase inhibitor (SEQ ID NOs: 93-99), pullulanase (SEQ ID NO: 109) [the last gene in the α-amylase series], β-amylase (SEQ ID NOs: 101-107), α-glucosidase (SEQ ID NOs: 111-117)) are all strongly expressed towards the end of grain development, reflecting their involvement in the final stages of shaping the starch granule. Genes encoding isoforms of an α-amylase inhibitor (SEQ ID NOs: 93 and 95) are expressed most strongly in the aleurone and seed coat layers and the endosperm, and not (or to a reduced extent) in the embryo. The embryo also shows a different expression of genes encoding starch synthase and branching enzymes, perhaps reflecting its status as an energy-requiring sink organ rather than as a storage tissue. Myers et al. discuss the interaction of starch synthases, branching enzymes, debranching enzymes and disproportionating enzymes in producing and trimming glucan molecules so that a final transition may take place to a crystalline form (A. M. Myers, M. K. Morell, M. G. James, S. G. Ball, Plant Physiol. 122, 989 (2000)).

In a further embodiment, the present invention provides the ability to modulate the shape and the physico-chemical properties of the starch granule by modifying the expression level and pattern of those genes that encode products involved in starch structure rearrangement such as, for example, isoamylase debranching enzyme (SEQ ID NOs: 75-77) and branching enzyme (SEQ ID NOs: 69-73), and in starch degradation such as α-amylase (SEQ ID NOs: 79-91), α-amylase inhibitor (SEQ ID NOs: 93-99), pullulanase (SEQ ID NO: 109), β-amylase (SEQ ID NOs: 101-107), and/or α-glucosidase (SEQ ID NOs: 111-117).

The invention thus also relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide involved in starch structure rearrangement, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 75-77 exhibiting isoamylase debranching enzyme activity; 69-73 exhibiting a branching enzyme activity; 80-92 exhibiting an α-amylase activity; 94-100 exhibiting an α-amylase inhibitor activity; 110 exhibiting a pullulanase activity; 102-108 exhibiting a β-amylase activity; 112-118 exhibiting an α-glucosidase activity.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which is involved in starch structure rearrangement and up-regulated during grain filling and has at least between 70% and 99% amino acid sequence identity to at least one polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 75-77 exhibiting isoamylase debranching enzyme activity; 69-73 exhibiting a branching enzyme activity; 80-92 exhibiting an α-amylase activity; 94-100 exhibiting an α-amylase inhibitor activity; 110 exhibiting a pullulanase activity; 102-108 exhibiting a β-amylase activity; 112-118 exhibiting an α-glucosidase activity, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which is involved in starch structure rearrangement and up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 75-77 exhibiting isoamylase debranching enzyme activity; 69-73 exhibiting a branching enzyme activity; 80-92 exhibiting an α-amylase activity; 94-100 exhibiting an α-amylase inhibitor activity; 110 exhibiting a pullulanase activity; 102-108 exhibiting a β-amylase activity; 112-118 exhibiting an α-glucosidase activity.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   -   a) as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 75-77 exhibiting isoamylase debranching enzyme activity; 69-73 exhibiting a branching enzyme activity; 79-91 exhibiting an α-amylase activity; 93-99 exhibiting an α-amylase inhibitor activity; 109 exhibiting a pullulanase activity; 101-107 exhibiting a β-amylase activity; 111-117 exhibiting an α-glucosidase activity, or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
    -   b) having substantial similarity to (a);
    -   c) capable of hybridizing to (a) or the complement thereof;
    -   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 75-77 exhibiting isoamylase debranching enzyme activity; 69-73 exhibiting a branching enzyme activity; 79-91 exhibiting an α-amylase activity; 93-99 exhibiting an α-amylase inhibitor activity; 109 exhibiting a pullulanase activity; 101-107 exhibiting a β-amylase activity; 111-117 exhibiting an α-glucosidase activity, or the complement thereof;
    -   e) complementary to (a), (b) or (c); and
    -   f) which is the reverse complement of (a), (b) or (c).

The identification of a defined subset of genes that are involved in carbohydrate metabolism, but especially in starch metabolism, and the expression of which is coordinately up- or down-regulated during the grain filling process makes it now possible to improve grain quality by overexpressing and/or underexpressing or completely knocking out genes that are known to positively contribute to the nutritional or processing properties of grains such as, for example, genes encoding products involved in starch structure rearrangement and starch degradation as mentioned hereinbefore.

The expression of α-amylase, which is central in the starch biosynthesis pathway, may further be modified to obtain plants producing a desirable content of reducing sugars. For example, a high content of reducing sugar resulting from a high α-amylase activity is desirable when rice or other cereal plants are to be used for the production of alcohol. This can be achieved by modifying the expression of the plant endogenous genes encoding an α-amylase or α-amylase inhibitor activity, for example, by introducing and overexpressing in a target plant a nucleic acid molecule comprising a nucleotide sequence that encodes a polypeptide the amino acid sequence of which is substantially similar to any one of those given in SEQ ID NOs: 80-92 exhibiting an α-amylase activity; and 94-100 exhibiting an α-amylase inhibitor activity.

In a specific embodiment, the invention thus also relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide exhibiting an amylase or an amylase inhibitor activity, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 80-92 exhibiting an α-amylase activity; and 94-100 exhibiting an α-amylase inhibitor activity.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which has an activity of an amylase, is up-regulated during grain filling and has between at least 70% and 99% amino acid sequence identity to at least one polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 80-92 exhibiting an α-amylase activity; and 94-100 exhibiting an α-amylase inhibitor activity, with any individual number within this range of between 70% and 99% also being part of the invention.
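The 70% to 99% amino acid sequence identity range recited above can be illustrated with a short calculation. The sketch below assumes the two polypeptide sequences have already been aligned to equal length by some alignment tool (not shown) and counts identity only over columns in which neither sequence has a gap. That choice of denominator, the function names and the toy peptides are assumptions made for illustration, since the text does not prescribe a particular identity calculation.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gap columns are excluded from the comparison
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def within_recited_range(identity: float, low: float = 70.0, high: float = 99.0) -> bool:
    """True when the identity falls inside the 70% to 99% range recited in the text."""
    return low <= identity <= high


if __name__ == "__main__":
    # Hypothetical ten-residue peptides differing at the last position.
    identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
    print(identity, within_recited_range(identity))
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different percentages for the same alignment, so the convention in use should be stated alongside any reported value.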

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which has an activity of an amylase, is up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 80-92 exhibiting an α-amylase activity; and 94-100 exhibiting an α-amylase inhibitor activity.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 79-91 exhibiting an α-amylase activity; and 93-99 exhibiting an α-amylase inhibitor activity, or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in the SEQ ID NOs of table 7 such as SEQ ID NOs: 79-91 exhibiting an α-amylase activity; and 93-99 exhibiting an α-amylase inhibitor activity, or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

Different isoforms often show distinct spatial expression patterns. For example, three different sucrose synthase isoforms (SEQ ID NOs: 119-123) are expressed in developing grain tissue, two of which (SEQ ID NOs: 121 and 123) are expressed more highly at the start of grain development (0 days post anthesis) and one of which (SEQ ID NO: 119) is up-regulated towards the end of grain development. The spatial distribution of each differs. Other isoforms (SEQ ID NOs: 125 and 127), showing low expression in the grain, are expressed strongly in stems or roots.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide exhibiting a sucrose synthase activity, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 120-128.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which has an activity of a sucrose synthase, is up-regulated during grain filling and has between at least 70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs: 120-128, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which has an activity of a sucrose synthase, is up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 120-128.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of SEQ ID NOs: 119-127 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 119-127 or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).
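Item d) above refers to stretches of 50 to 200 or more consecutive nucleotides taken from a listed sequence. As a minimal sketch, the generator below enumerates every stretch of a chosen length from a sequence string. The function name and the toy inputs are hypothetical. Enumerating fragments says nothing about whether a given fragment would actually hybridize, which depends on stringency conditions that are not modeled here.

```python
from typing import Iterator


def consecutive_fragments(seq: str, length: int) -> Iterator[str]:
    """Yield every run of `length` consecutive nucleotides from `seq`, left to right."""
    if length <= 0:
        raise ValueError("length must be positive")
    # When the sequence is shorter than `length`, the range is empty and nothing is yielded.
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]


if __name__ == "__main__":
    # Toy example with a short window so the output stays readable.
    print(list(consecutive_fragments("ACGTAC", 4)))
    # A 200-nucleotide sequence contains 200 - 50 + 1 windows of 50 nucleotides.
    print(sum(1 for _ in consecutive_fragments("A" * 200, 50)))
```

A sequence of n nucleotides yields n - k + 1 fragments of length k, so the number of candidate probe regions shrinks as the required stretch grows from 50 towards 200 nucleotides.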

In a further embodiment, the present invention provides the ability to regulate glucanases (as represented by SEQ ID NO: 191). Glucanases can be used to minimize wet droppings in poultry and swine diets high in wheat or barley by breaking down and reducing the viscosity of β-glucans and other non-starch polysaccharides, and thus can provide benefit as a processing aid in animal feed. For uses and applications of modifying crop plants by creating transgenic monocots and monocot seeds expressing rice β-glucanase enzymes and genes, we incorporate all relevant sections of PCT Publication WO9859046 to Rodriguez.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide exhibiting a glucanase activity, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NO: 192.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which has an activity of a glucanase, is up-regulated during grain filling and has between at least 70% and 99% amino acid sequence identity to the polypeptide of SEQ ID NO: 192, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide which has an activity of a glucanase, is up-regulated during grain filling and is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NO: 192.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in SEQ ID NO: 191 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of the nucleotide sequence given in SEQ ID NO: 191 or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

Thus, in an embodiment applicable to all of the above stated provisions, the present invention provides nucleotide sequences encoding at least one polypeptide involved in the synthesis, metabolism, transport or storage of carbohydrates, as well as any polypeptides encoded thereby, or any antigene sequences thereof, which have numerous applications using techniques that are known to those skilled in the art of molecular biology, biotechnology, biochemistry, genetics, physiology or pathology. These techniques include the use of nucleotide molecules as hybridization probes, for chromosome and gene mapping, in PCR technologies, in the production of sense or antisense nucleic acids, in screening for new therapeutic molecules, in production of plants and seeds having desirable, inheritable, commercially useful phenotypes, or in discovery of inhibitory compounds.

In a further collective embodiment, the present invention provides the ability to modulate carbohydrates, sugars and their transporters in plant tissues, by over-expressing, under-expressing or knocking out one or more cell cycle genes or their gene products, in a plant cell, in vitro or in planta. Expression vectors comprising at least one nucleotide sequence involved in carbohydrate or sugar synthesis, metabolism, transport or storage, or any antigenes thereof, operably linked to at least one suitable promoter and/or regulatory sequence can be used to study the role of polypeptides encoded by said sequences, for example by transforming a host cell with said expression vector and measuring the effects of overexpression and underexpression of sequences. A host cell transformed with at least one expression vector comprising nucleotide sequences involved in carbohydrate modulation, operably linked to suitable promoters and/or regulatory sequences, can be useful to produce a dietary supplement comprising a polypeptide having a defined amino acid profile.

In a further collective embodiment, the present invention provides a transformed plant host cell, or one obtained through breeding, capable of over-expressing, under-expressing, or having a knock out of said metabolic genes and/or their gene products.

Such a plant cell, transformed with at least one expression vector comprising nucleotide sequences involved in carbohydrate synthesis, metabolism, transport or storage, operably linked to suitable promoters and/or regulatory sequences, can be used to regenerate plant tissue or an entire plant, or seed therefrom, in which the effects of expression, including overexpression or underexpression, of the introduced sequence or sequences can be measured in vitro or in planta.

A further subset of genes provided herein comprises genes that encode polypeptides with an activity that is involved in or associated with the production of seed storage proteins.

In seeds of higher plants, proteins are contained in an amount of 20-30% by weight in the case of beans, and in an amount of about 10% by weight in the case of cereals, based on dry weight. Among the proteins in seeds, 70-80% by weight are storage proteins. In rice seeds in particular, about 80% by weight of the seed storage proteins is glutelin, which is soluble only in dilute acids and dilute alkalis. The remainder is prolamin (10-15% by weight), soluble in organic solvents, and globulin (5-10% by weight), solubilized by salts.
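The percentages above can be chained to estimate how much of the dry weight of a rice seed is glutelin. The sketch below multiplies the stated figures (about 10% protein in cereals, 70-80% of that as storage protein, about 80% of storage protein as glutelin). The helper name `fraction_range` is hypothetical, and treating the generic cereal figure as applicable to rice is an assumption made for illustration.

```python
def fraction_range(*factors):
    """Multiply (low, high) fraction pairs and return the combined (low, high) range."""
    low = 1.0
    high = 1.0
    for factor_low, factor_high in factors:
        low *= factor_low
        high *= factor_high
    return low, high


# Figures taken from the paragraph above, expressed as fractions.
cereal_protein = (0.10, 0.10)   # about 10% of seed dry weight is protein
storage_share = (0.70, 0.80)    # 70-80% of seed protein is storage protein
glutelin_share = (0.80, 0.80)   # about 80% of storage protein is glutelin

low, high = fraction_range(cereal_protein, storage_share, glutelin_share)

if __name__ == "__main__":
    print(f"glutelin: {low:.1%} to {high:.1%} of seed dry weight")
```

On these figures glutelin works out to roughly 5.6% to 6.4% of seed dry weight, which gives a sense of the scale at which down-regulating glutelin genes could change grain protein content.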

Seed storage proteins are important as a protein source in foods and feeds, so that they have been well studied from the viewpoints of nutrition and protein chemistry. As a result, in cereals, storage protein genes of maize, wheat, barley and the like have been cloned, amino acid sequences of the proteins have been deduced from the nucleotide sequences, and regulatory regions of the genes have been analyzed.

The present invention provides a subset of nucleic acid molecules that is up-regulated during grain filling and comprises a nucleotide sequence encoding a seed storage protein. Representative examples of these genes are given in SEQ ID NOs: 211-249.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence encoding a seed storage protein, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in any one of the SEQ ID NOs of table 8 such as SEQ ID NOs: 212-250.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a seed storage protein which is up-regulated during grain filling and has between at least 70% and 99% amino acid sequence identity to at least one polypeptide as given in any one of the SEQ ID NOs of table 8 such as SEQ ID NOs: 212-250, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a seed storage protein, which is up-regulated during grain filling and immunologically reactive with antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 8 such as SEQ ID NOs: 212-250.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of the SEQ ID NOs of table 8 such as SEQ ID NOs: 211-249 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ ID NOs of table 8 such as SEQ ID NOs: 211-249 or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

By providing the above subset of genes, the protein content and composition in the plant grain can be modified by up- or down-regulating the expression of at least one nucleic acid molecule within this subgroup, giving rise to altered levels or an altered composition of seed storage protein in the plant grain.

For rice grains to be processed, it is advantageous that the protein content is low. In the case of rice to be used for preparing fermented alcoholic beverages, this can be attained through well defined refinement measures, thereby removing the proteins in the peripheral portion of the endosperm, which contains large amounts of storage proteins. In producing rice starch, in order to promote the purity, proteins are removed by treatments with alkalis, surfactants and ultrasonication.

The protein content in the rice grain also influences the taste of rice. Good tasting rice grains usually have low protein contents. Rice varieties with a low protein content have been developed by conventional cross-breeding or by mutation breeding (U.S. Pat. No. 5,516,668; Maruta).

U.S. Pat. No. 5,516,668 describes a method for decreasing the amount of glutelin in plant seeds, comprising introducing into a rice plant a gene which is a template for the transcription of an antisense RNA against rice glutelin; and transcribing said gene in seeds from said rice plant to inhibit translation of mRNA of glutelin, thereby decreasing the amount of glutelin in said seeds in comparison to the amount of glutelin contained in seeds from unmodified wild-type rice plants.

The cDNA of glutelin, which is a seed storage protein in rice, has been cloned and the complete primary structure of the protein has been determined by sequencing the cDNA. The gene of this protein has been isolated by using the cDNA as a probe (Japanese Laid-open Patent Application (Kokai) No. 63-91085).

Rice plants with a low glutelin content in the rice grain can now be produced more efficiently by down-regulating two or more of the endogenous glutelin genes in rice seeds such as those provided in SEQ ID NOs: 223, 235, and 239 using methods known in the art including antisense and dsRNAi techniques.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence encoding a glutelin protein the expression of which is up-regulated during grain filling, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 224, 236, and 240.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a glutelin protein the expression of which is up-regulated during grain filling and which has between at least 70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs: 224, 236, and 240, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a seed glutelin protein, the expression of which is up-regulated during grain filling and which is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 224, 236, and 240.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of SEQ ID NOs: 223, 235, and 239 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 223, 235, and 239, or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

Another class of seed storage proteins are the prolamins, which are naturally rich in the essential amino acids lysine and methionine. Overexpressing the genes encoding them can thus increase the nutritional value of feeds and foods by producing said proteins at higher levels than those found in the unmodified wild-type plants. Another aspect of the present invention thus relates to providing genes that encode rice prolamin protein such as those given in SEQ ID NOs: 217, 219, 225 and 241.

The invention thus also relates to a polynucleotide comprising a nucleotide sequence encoding a prolamin protein the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 218, 220, 226 and 242.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a prolamin protein, the expression of which is up-regulated during grain filling and which has between at least 70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs: 218, 220, 226 and 242, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a prolamin protein, the expression of which is up-regulated during grain filling and which is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 218, 220, 226 and 242.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of SEQ ID NOs: 217, 219, 225 and 241 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 217, 219, 225 and 241, or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

Gliadins are a further group of seed storage proteins that are of economic importance. Gliadin is a single-chained protein having an average molecular weight of about 30,000-40,000, with an isoelectric point of pH 4.0-5.0. Gliadin proteins are extremely sticky when hydrated and have little or no resistance to extension. Gliadin is responsible for giving gluten dough its characteristic cohesiveness. Gliadin is a premium product, when available.

Gliadin is known to improve the freeze-thaw stability of frozen dough and also improves microwave stability. This product is also used as an all-natural chewing gum base replacer and a pharmaceutical binder, improves the texture and mouth feel of pasta products, and has been found to improve cosmetic products.

The invention provides a further subset of genes comprising a nucleotide sequence that encodes gliadin storage proteins. By overexpressing said genes in the plant, but preferably in the plant seed, the plant produces grain with an increased concentration of gliadin as compared to the unmodified wild-type plant.

In a particular embodiment, the invention thus relates to a polynucleotide comprising a nucleotide sequence encoding a gliadin protein, the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 212, 219, 234, 248 and 250.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a gliadin protein, the expression of which is up-regulated during grain filling and which has between at least 70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs: 212, 219, 234, 248 and 250, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a seed gliadin protein, the expression of which is up-regulated during grain filling and which is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 212, 219, 234, 248 and 250.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of SEQ ID NOs: 211, 220, 233, 247 and 249 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 211, 220, 233, 247 and 249, or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

In a further embodiment the invention provides a subset of genes which encode polypeptides that are involved in or associated with the metabolism of fatty acids in the rice grain.

Seed oil content has traditionally been modified by plant breeding. The use of recombinant DNA technology to alter seed oil composition can accelerate this process and in some cases alter seed oils in a way that cannot be accomplished by breeding alone. The oil composition of Brassica has been significantly altered by modifying the expression of a number of lipid metabolism genes. Such manipulations of seed oil composition have focused on altering the proportion of endogenous component fatty acids. For example, antisense repression of the Δ12-desaturase gene in transgenic rapeseed has resulted in an increase in oleic acid of up to 83% (Topfer et al. 1995 Science 268:681-686).

There have been some successful attempts at modifying the composition of seed oil in transgenic plants by introducing new genes that allow the production of a fatty acid that the host plants were not previously capable of synthesizing. Van de Loo, et al. (1995 Proc. Natl. Acad. Sci USA 92:6743-6747) have been able to introduce a Δ12-hydroxylase gene into transgenic tobacco, resulting in the introduction of a novel fatty acid, ricinoleic acid, into its seed oil. The reported accumulation was modest from plants carrying constructs in which transcription of the hydroxylase gene was under the control of the cauliflower mosaic virus (CaMV) 35S promoter. Similarly, tobacco plants have been engineered to produce low levels of petroselinic acid by expression of an acyl-ACP desaturase from coriander (Cahoon et al. 1992 Proc. Natl. Acad. Sci USA 89:11184-11188).

The long chain fatty acids (C18 and larger) have significant economic value both as nutritionally and medically important foods and as industrial commodities (Ohlrogge, J. B. 1994 Plant Physiol. 104:821-826). Linoleic (18:2 Δ9,12) and α-linolenic acid (18:3 Δ9,12,15) are essential fatty acids found in many seed oils. The levels of these fatty acids have been manipulated in oil seed crops through breeding and biotechnology (Ohlrogge, et al. 1991 Biochim. Biophys. Acta 1082:1-26; Topfer et al. 1995 Science 268:681-686). Additionally, the production of novel fatty acids in seed oils can be of considerable use in both human health and industrial applications.
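The fatty acid shorthand used above gives the number of carbon atoms, a colon, the number of double bonds, and then a delta followed by the double-bond positions. As a minimal sketch, the parser below splits such a label into its parts. The names `FattyAcid` and `parse_fatty_acid` are hypothetical, and accepting either the Δ character or the literal `.DELTA.` placeholder is an assumption made so that both spellings of the same label can be read.

```python
import re
from typing import NamedTuple, Tuple


class FattyAcid(NamedTuple):
    carbons: int
    double_bonds: int
    positions: Tuple[int, ...]


# carbons ":" double bonds, then an optional delta marker with comma-separated positions
_PATTERN = re.compile(r"^\s*(\d+):(\d+)\s*(?:(?:Δ|\.DELTA\.)\s*(\d+(?:,\d+)*))?\s*$")


def parse_fatty_acid(shorthand: str) -> FattyAcid:
    """Parse labels such as '18:2Δ9,12' into carbons, double bonds and positions."""
    match = _PATTERN.match(shorthand)
    if match is None:
        raise ValueError(f"unrecognized fatty acid shorthand: {shorthand!r}")
    carbons = int(match.group(1))
    double_bonds = int(match.group(2))
    positions = tuple(int(p) for p in match.group(3).split(",")) if match.group(3) else ()
    # When positions are listed, there should be one per double bond.
    if positions and len(positions) != double_bonds:
        raise ValueError("number of positions does not match number of double bonds")
    return FattyAcid(carbons, double_bonds, positions)


if __name__ == "__main__":
    print(parse_fatty_acid("18:2Δ9,12"))      # linoleic acid
    print(parse_fatty_acid("18:3Δ9,12,15"))   # alpha-linolenic acid
```

Parsed this way, two fatty acids with the same chain length and number of double bonds can still be told apart by their position tuples.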

Consumption of plant oils rich in γ-linolenic acid (GLA) (18:3 Δ6,9,12) is thought to alleviate hypercholesterolemia and other related clinical disorders which correlate with susceptibility to coronary heart disease (Brenner R. R. 1976 Adv. Exp. Med. Biol. 83:85-101). The therapeutic benefits of dietary GLA may result from its role as a precursor to prostaglandin synthesis (Weete, J. D. 1980 in Lipid Biochemistry of Fungi and Other Organisms, eds. Plenum Press, New York, pp. 59-62). Linoleic acid (18:2) (LA) is transformed into γ-linolenic acid (18:3) (GLA) by the enzyme Δ6-desaturase.

Few seed oils contain GLA despite high contents of the precursor linoleic acid. This is due to the absence of Δ6-desaturase activity in most plants. For example, only borage (Borago officinalis), evening primrose (Oenothera biennis), and currants (Ribes nigrum) produce appreciable amounts of γ-linolenic acid. Of these three species, only Oenothera and borage are cultivated as a commercial source for GLA. It would be beneficial if agronomic seed oils could be engineered to produce GLA in significant quantities by introducing a heterologous Δ6-desaturase gene. It would also be beneficial if other expression products associated with fatty acid synthesis and lipid metabolism could be produced in plants at high enough levels so that commercial production of a particular expression product becomes feasible.

As disclosed in U.S. Pat. No. 5,552,306, a cyanobacterial Δ6-desaturase gene has been recently isolated. Expression of this cyanobacterial gene in transgenic tobacco resulted in significant but low level GLA accumulation (Reddy et al. 1996 Nature Biotech. 14:639-642).

The present invention now provides a subset of genes encoding polypeptides that are involved in or associated with fatty acid metabolism, the expression of which is up-regulated during grain filling.

In particular, the invention relates to a polynucleotide the expression of which is up-regulated during grain filling comprising a nucleotide sequence encoding a polypeptide that is involved in or associated with fatty acid synthesis or lipid metabolism, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in any one of the SEQ ID NOs of table 9 such as SEQ ID NOs: 252-280.

More specifically, the invention relates to a polynucleotide the expression of which is up-regulated during grain filling comprising a nucleotide sequence encoding a polypeptide that is involved in or associated with fatty acid synthesis or lipid metabolism and has between at least 70% and 99% amino acid sequence identity to at least one polypeptide as given in any one of the SEQ ID NOs of table 9 such as SEQ ID NOs: 252-280, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide the expression of which is up-regulated during grain filling comprising a nucleotide sequence encoding a polypeptide that is involved in or associated with fatty acid synthesis or lipid metabolism and immunologically reactive with antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 9 such as SEQ ID NOs: 252-280.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of the SEQ ID NOs of table 9 such as SEQ ID NOs: 251-279 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% of the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ ID NOs of table 9 such as SEQ ID NOs: 251-279 or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

By providing this subset of genes it is now possible to modify the level and composition of grain lipids by modulating the expression of those genes in the plant seed. Expression can be modulated either by introducing at least one of the nucleic acid molecules from this subset into the plant, preferably under control of a seed specific promoter, and overexpressing said at least one nucleic acid molecule in the plant seed, or by down-regulating expression of the corresponding endogenous gene applying techniques known in the art including antisense and dsRNAi techniques.

In a specific embodiment, the invention relates to a subset of genes encoding oleosins as represented by SEQ ID NOs: 257 and 259.

Oleosins are abundant seed proteins associated with the phospholipidmonolayer membrane of oil bodies, which are a means for storing lipidsin the plant cell. Analysis of the contents of lipid bodies hasdemonstrated that in addition to triglyceride and membrane lipids, thereare also several polypeptides/proteins associated with the surface orlumen of the oil body (Bowman-Vance and Huang, 1987, J. Biol. Chem.,262:11275-11279, Murphy et al., 1989, Biochem. J., 258:285-293, Tayloret al., 1990, Planta, 181:18-26). Oil-body proteins have been identifiedin a wide range of taxonomically diverse species (Moreau et al., 1980,Plant Physiol., 65:1176-1180; Qu et al., 1986, Biochem. J., 235:57-65)and have been shown to be uniquely localized in oil-bodies and not foundin organelles of vegetative tissues. In Brassica napus (rapeseed,canola) there are at least three polypeptides associated with theoil-bodies of developing seeds (Taylor et al., 1990, Planta, 181:18-26).

Oleosins are among the most abundant proteins associated with the phospholipid monolayer membrane of oil bodies. The first oleosin gene, L3, was cloned from maize by selecting clones whose in vitro translated products were recognized by an anti-L3 antibody (Vance et al. 1987 J. Biol. Chem. 262:11275-11279). Subsequently, different isoforms of oleosin genes from such different species as Brassica, soybean, carrot, pine, and Arabidopsis have been cloned (Huang, A. H. C., 1992, Ann. Reviews Plant Phys. and Plant Mol. Biol. 43:177-200; Kirik et al., 1996 Plant Mol. Biol. 31:413-417; Van Rooijen et al., 1992 Plant Mol. Biol. 18:1177-1179; Zou et al., Plant Mol. Biol. 31:429-433). Oleosin protein sequences predicted from these genes are highly conserved, especially for the central hydrophobic domain. All of these oleosins have the characteristic feature of three distinctive domains. An amphipathic domain of 40-60 amino acids is present at the N-terminus; a totally hydrophobic domain of 68-74 amino acids is located at the center; and an amphipathic α-helical domain of 33-40 amino acids is situated at the C-terminus (Huang, A. H. C. 1992).

A maize oleosin has been expressed in seed oil bodies in Brassica napus transformed with a Zea mays oleosin gene. The gene was expressed under the control of regulatory elements from a Brassica gene encoding napin, a major seed storage protein. The temporal regulation and tissue specificity of expression was reported to be correct for a napin gene promoter/terminator (Lee et al., 1991, Proc. Natl. Acad. Sci. U.S.A., 88:6181-6185).

By providing a subset of genes encoding oleosins, it is now possible to modify the oleosin content in the phospholipid monolayer membrane of oil bodies by either introducing the genes provided herein into a plant and overexpressing said genes in said plant or, in the alternative, by down-regulating expression of the endogenous oleosin-encoding genes in the plant using methods known in the art, including anti-sense or dsRNAi techniques.

In one specific embodiment, the present invention thus relates to a polynucleotide comprising a nucleotide sequence encoding an oleosin protein, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 258 and 260.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding an oleosin protein, which is up-regulated during grain filling and has between 70% and 99% amino acid sequence identity to at least one polypeptide of SEQ ID NOs: 258 and 260, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding an oleosin protein, which is up-regulated during grain filling and immunologically reactive with antibodies raised against a polypeptide of SEQ ID NOs: 258 and 260.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of SEQ ID NOs: 257 and 259 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 257 and 259, or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).
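Items (e) and (f) of the list above distinguish a complementary sequence from a reverse complement. The following minimal Python sketch illustrates that distinction only. The helper names and the six-base example are hypothetical and are not taken from the Sequence Listing.

```python
# Illustrative sketch only: helper names and the example sequence are
# hypothetical and do not correspond to any SEQ ID NO of the disclosure.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the base-by-base complement, written in the same order as seq."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complement written in reverse order (the opposite strand, 5'->3')."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    example = "ATGGCC"  # made-up sequence
    print(complement(example))          # TACCGG
    print(reverse_complement(example))  # GGCCAT
```

Applying the reverse complement twice returns the original sequence, which is a convenient sanity check when handling probe or primer sequences.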

At least one of the genes provided herein, which is up-regulated during grain filling, encodes a phytoene dehydrogenase polypeptide that is involved in carotenoid biosynthesis and can thus be used to modify carotenoid production in grain.

Carotenoids are natural pigments that are essential to microbial, plant, and animal life. In photosynthetic organisms, they act as potent antioxidants that negate the lethal effects of singlet oxygen and superoxide formed during oxygen production. As human dietary constituents, these lipophilic antioxidants provide our cells with chemical protectants against the damaging effects of oxidation. Acting as chemical scavengers, carotenoids play roles in the prevention of cancer and chronic maladies, including heart disease.

Phytoene (7,8,11,12,7′,8′,11′,12′-octahydro-ψ,ψ-carotene) is the first carotenoid in the carotenoid biosynthesis pathway and is produced by the dimerization of a 20-carbon atom precursor, geranylgeranyl pyrophosphate (GGPP). Phytoene has useful applications in treating skin disorders (U.S. Pat. No. 4,642,318) and is itself a precursor for colored carotenoids. Aside from certain mutant organisms, such as Phycomyces blakesleeanus carB, no current methods are available for producing phytoene via any biological process.

In some organisms, the red carotenoid lycopene (ψ,ψ-carotene) is the next carotenoid produced from phytoene in the pathway. Lycopene imparts the characteristic red color to ripe tomatoes.

Lycopene has utility as a food colorant. It is also an intermediate in the biosynthesis of other carotenoids in some bacteria, fungi and green plants.

Lycopene is prepared biosynthetically from phytoene through four sequential dehydrogenation reactions by the removal of eight atoms of hydrogen. The enzymes that remove hydrogen from phytoene are phytoene dehydrogenases. One or more phytoene dehydrogenases can be used to convert phytoene to lycopene, and dehydrogenated derivatives of phytoene intermediate to lycopene are also known. For example, some strains of Rhodobacter sphaeroides contain a phytoene dehydrogenase that removes six atoms of hydrogen from phytoene to produce neurosporene.
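The hydrogen bookkeeping in the preceding paragraph can be checked with a few lines of Python. This is a sketch under stated assumptions: the molecular formula of phytoene (C40H64) and the names of the intermediates are standard textbook values that are not given in the text above, and each dehydrogenation step is assumed to remove two hydrogen atoms, as implied by four steps removing eight atoms.

```python
# Sketch under assumptions: C40H64 for phytoene and the intermediate names are
# textbook values, not taken from the disclosure; function names are hypothetical.
PHYTOENE_HYDROGENS = 64   # phytoene is C40H64
HYDROGENS_PER_STEP = 2    # assumed: each dehydrogenation removes two H atoms
INTERMEDIATES = ["phytoene", "phytofluene", "zeta-carotene",
                 "neurosporene", "lycopene"]


def hydrogens_removed(steps: int) -> int:
    """Hydrogen atoms removed from phytoene after the given number of steps (0-4)."""
    if not 0 <= steps < len(INTERMEDIATES):
        raise ValueError("steps must be between 0 and 4")
    return HYDROGENS_PER_STEP * steps


def formula_after(steps: int) -> str:
    """Molecular formula of the carotenoid reached after the given number of steps."""
    return f"C40H{PHYTOENE_HYDROGENS - hydrogens_removed(steps)}"


if __name__ == "__main__":
    for step, name in enumerate(INTERMEDIATES):
        print(step, name, formula_after(step), hydrogens_removed(step))
```

Three steps correspond to the six hydrogen atoms removed on the way to neurosporene, and four steps to the eight removed on the way to lycopene.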

Lycopene is an intermediate in the biosynthesis of carotenoids in some bacteria, fungi, and all green plants. Carotenoid-specific genes that can be used for synthesis of lycopene from the ubiquitous precursor farnesyl pyrophosphate include those for the enzymes GGPP synthase, phytoene synthase, and phytoene dehydrogenase-4H.

In one specific embodiment the present invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the dehydrogenation of phytoene and the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NO: 278.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the dehydrogenation of phytoene and the expression of which is up-regulated during grain filling and which has between 70% and 99% amino acid sequence identity to the polypeptide of SEQ ID NO: 278, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the dehydrogenation of phytoene and the expression of which is up-regulated during grain filling and which is immunologically reactive with antibodies raised against a polypeptide of SEQ ID NO: 278.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in SEQ ID NO: 277 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NO: 277, or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

Another subset of genes that is provided as part of the invention comprises nucleic acid molecules that are involved in the transcriptional control of the highly coordinated grain filling process.

Transcription factors are proteins that bind to the enhancer or promoter regions and interact such that transcription occurs from only a small group of promoters in any cell. Most transcription factors can bind to specific DNA sequences, and these trans-regulatory proteins can be grouped together in families based on similarities in structure. Within such a family, proteins share a common framework structure in their respective DNA-binding sites, and slight differences in the amino acids at the binding site can alter the sequence of the DNA to which the protein binds. In addition to having this sequence-specific DNA-binding domain, transcription factors contain a domain involved in activating the transcription of the gene whose promoter or enhancer they have bound. Usually, this trans-activating domain enables the transcription factor to interact with proteins involved in binding RNA polymerase. This interaction often enhances the efficiency with which the basal transcriptional complex can be built and bind RNA polymerase II. There are several families of transcription factors, and those discussed here are just some of the main types.

The gene subset provided herein includes a gene which encodes a polypeptide that is similar to the CREB-binding protein from Mus sp. (as represented by SEQ ID NO: 301), and is highly expressed in aleurone and endosperm tissues during grain filling. CREB-binding protein (CBP) is a necessary component of the CREB/PKA paradigm of gene regulation. The acetylation of histones and other proteins has been linked to gene regulation, and CBP has a potent intrinsic acetyltransferase (AT) enzymatic domain. CREB belongs to a class of proteins whose phosphorylation appears specifically to enhance their trans-activation potential (Arias J, et al Nature 1994 Jul. 21;370(6486):226-9).

CBP possesses intrinsic histone acetyltransferase activity, and can acetylate not only histones but also certain transcription factors such as GATA1, p53, and myb-type transcription factors such as c-Myb (Yuji Sano and Shunsuke Ishii J. Biol. Chem., Vol. 276, Issue 5, 3674-3682, Feb. 2, 2001). Acetylation of c-Myb by CBP increases the trans-activating capacity of c-Myb by enhancing its association with CBP. These results demonstrate a novel molecular mechanism of regulation of c-Myb activity.

In rice, 70 known and putative MYB genes could be identified, some of which show interesting expression patterns such as those given in SEQ ID NOs: 311-321. The expression pattern of these transcription factors suggests that they play a key role during rice grain filling.

Another transcription factor gene (as represented by SEQ ID NO: 305) included in this subset encodes a protein that has structural similarity to the yeast HAP5 transcriptional activator protein. In yeast, the HAP5 protein is a component of the HAP (Hap2p-Hap3p-Hap4p-Hap5p) CCAAT-box-binding transcriptional activation complex and is essential for the binding activity of the complex.

A further transcription factor gene within this subset is represented by SEQ ID NO: 307, which encodes a bZIP-type transcription factor similar to the plant G-box binding factor GBF4 that was found in Arabidopsis. GBF4, in a manner reminiscent of the Fos-related oncoproteins of mammalian systems, cannot bind to DNA as a homodimer, although it contains a basic region capable of specifically recognizing the G-box and G-box-like elements. However, GBF4 can interact with GBF2 and GBF3 to bind DNA as heterodimers. Mutagenesis of the leucine zipper of GBF4 indicates that the mutation of a single amino acid confers upon the protein the ability to recognize the G-box as a homodimer, apparently by altering the charge distribution within the leucine zipper (A E Menkens and A R Cashmore (1994) PNAS 91: 2522-2526).

Another of the transcription factor genes within this subset encodes a protein that has a zinc finger domain and is similar to a zinc-finger type transcription factor found in Arabidopsis (gi|6899934).

Zinc finger proteins include WT-1 (an important transcription factor critical in the formation of the kidney and gonads); the ubiquitous transcription factor Sp1; Xenopus 5S rRNA transcription factor TFIIIA; Krox 20 (a protein that regulates gene expression in the developing hindbrain); Egr-1 (which commits white blood cell development to the macrophage lineage); Krüppel (a protein that specifies abdominal cells in Drosophila); and numerous steroid-binding transcription factors. Each of these proteins has two or more “DNA-binding fingers,” α-helical domains whose central amino acids tend to be basic. These domains are linked together in tandem and are each stabilized by a centrally located zinc ion coordinated by two cysteines (at the base of the helix) and two internal histidines. The crystal structure shows that the zinc fingers bind in the major groove of the DNA.

The expression pattern of these transcription factors during grain filling suggests that they play a key role during rice grain development. This is further supported by the fact that the AACA promoter element, which is known to be conserved in many seed storage protein genes, is over-represented in the promoters of the grain filling sub-set genes according to the invention. This subset comprises genes the protein products of which are involved in diverse cellular functions, including carbohydrate, protein and fatty acid metabolism, nutrient transportation, and transcription and translation. The AACA promoter element was thus shown to be likely one of the key elements in the coordination of different major pathways during grain development.

In one embodiment the invention thus relates to a polynucleotide comprising a nucleotide sequence that encodes a polypeptide that acts as a transcription factor and the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in any one of the SEQ ID NOs of table 11 such as SEQ ID NOs: 302-328.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence that encodes a polypeptide that acts as a transcription factor and the expression of which is up-regulated during grain filling and which has between 70% and 99% amino acid sequence identity to at least one polypeptide as given in any one of the SEQ ID NOs of table 11 such as SEQ ID NOs: 302-328, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence that encodes a polypeptide that acts as a transcription factor and the expression of which is up-regulated during grain filling and which is immunologically reactive with antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 11 such as SEQ ID NOs: 302-328.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of the SEQ ID NOs of table 11 such as SEQ ID NOs: 301-327 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ ID NOs of table 11 such as SEQ ID NOs: 301-327, or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

By changing the expression level and/or pattern of at least one transcription factor as provided herein, which is involved in the regulation and coordination of grain filling in plants, it is possible to modify the grain filling process to obtain grain with a modified nutritional composition and/or quality characteristics.

A further subset of genes which is provided herein comprises genes encoding polypeptides the activity of which is involved in or associated with amino acid metabolism.

In particular, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the metabolism of amino acids and the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in any one of the SEQ ID NOs of table 10 such as SEQ ID NOs: 282-300.

More specifically, the invention relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the metabolism of amino acids and the expression of which is up-regulated during grain filling, which polypeptide has between 70% and 99% amino acid sequence identity to at least one polypeptide as given in any one of the SEQ ID NOs of table 10 such as SEQ ID NOs: 282-300, with any individual number within this range of between 70% and 99% also being part of the invention.

The invention further relates to a polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the metabolism of amino acids and the expression of which is up-regulated during grain filling, which polypeptide is immunologically reactive with antibodies raised against a polypeptide as given in any one of the SEQ ID NOs of table 10 such as SEQ ID NOs: 282-300.

More particularly, the invention relates to a polynucleotide comprising a nucleotide sequence

-   a) as given in any one of the SEQ ID NOs of table 10 such as SEQ ID NOs: 281-299 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide;
-   b) having substantial similarity to (a);
-   c) capable of hybridizing to (a) or the complement thereof;
-   d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence as given in any one of the SEQ ID NOs of table 10 such as SEQ ID NOs: 281-299, or the complement thereof;
-   e) complementary to (a), (b) or (c); and
-   f) which is the reverse complement of (a), (b) or (c).

In a final embodiment, the present invention provides a subset of genes encoding polypeptides for which no biological function is known so far. It is within the scope of this invention that the expression products of these genes, representative examples of which are provided in column B of table 3, can for the first time be associated with a biological function. Based on their mRNA expression characteristics and their specific expression pattern during grain filling, it is suggested that they are involved in or associated with nutrient partitioning during the grain filling process.

By modifying the expression of at least one of the genes within this subgroup it is, therefore, possible to modify the compositional characteristics and thus the nutritional properties of the plant grain.

The present invention provides a set of genes, which were shown to be preferentially up-regulated and to share a similar expression pattern during the process of grain filling as specified hereinbefore. The genes within this subgroup are useful tools for generating plants which produce grain with modified compositional characteristics leading to improved nutritional properties.

According to one embodiment, the present invention is directed to a nucleic acid molecule comprising a nucleotide sequence isolated or obtained from any plant which encodes a polypeptide that has at least 70% amino acid sequence identity to a polypeptide encoded by a gene comprising any one of SEQ ID NOs provided in the Sequence Listing.

Based on the Oryza nucleic acid sequences of the present invention as given in the SEQ ID NOs of the Sequence Listing, orthologs may be identified or isolated from the genome of any desired organism, preferably from another plant, according to well known techniques based on their sequence similarity to the Oryza nucleic acid sequences, e.g., hybridization, PCR or computer generated sequence comparisons. For example, all or a portion of a particular Oryza nucleic acid sequence is used as a probe that selectively hybridizes to other gene sequences present in a population of cloned genomic DNA fragments or cDNA fragments (i.e., genomic or cDNA libraries) from a chosen source organism. Further, suitable genomic and cDNA libraries may be prepared from any cell or tissue of an organism. Such techniques include hybridization screening of plated DNA libraries (either plaques or colonies; see, e.g., Sambrook et al., 1989) and amplification by PCR using oligonucleotide primers preferably corresponding to sequence domains conserved among related polypeptides or subsequences of the nucleotide sequences provided herein (see, e.g., Innis et al., 1990). These methods are particularly well suited to the isolation of gene sequences from organisms closely related to the organism from which the probe sequence is derived. The application of these methods using the Oryza sequences as probes is well suited for the isolation of gene sequences from any source organism, preferably other plant species. In a PCR approach, oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any plant of interest. Methods for designing PCR primers and PCR cloning are generally known in the art.

In hybridization techniques, all or part of a known nucleotide sequence is used as a probe that selectively hybridizes to other corresponding nucleotide sequences present in a population of cloned genomic DNA fragments or cDNA fragments (i.e., genomic or cDNA libraries) from a chosen organism. The hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides, and may be labeled with a detectable group such as ³²P, or any other detectable marker. Thus, for example, probes for hybridization can be made by labeling synthetic oligonucleotides based on the sequence of the invention. Methods for preparation of probes for hybridization and for construction of cDNA and genomic libraries are generally known in the art and are disclosed in Sambrook et al. (1989). In general, sequences that hybridize to the sequences disclosed herein will have at least 40% to 50%, about 60% to 70% and even about 80%, 85%, 90%, 95% to 98% or more identity with the disclosed sequences. That is, the sequence similarity of sequences may range, sharing at least about 40% to 50%, about 60% to 70%, and even about 80%, 85%, 90%, 95% to 98% sequence similarity, with each individual number within the ranges given above also being part of the invention.
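The identity percentages quoted above are computed over an alignment of two sequences. The following Python sketch shows one simple way such a figure could be obtained from two sequences that have already been aligned. It is an illustration under stated assumptions, not the method of the disclosure: the function name is hypothetical, gaps are written as "-", columns where both sequences have a gap are skipped, and a column with a gap in only one sequence counts as a mismatch. Alignment programs differ in how they treat gaps, so their reported values may differ.

```python
# Illustrative sketch only: the function name, the gap convention and the
# example sequences are assumptions, not part of the disclosure.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns in which both sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1  # a gap never matches here, because both-gap columns were skipped
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 87.5
    print(percent_identity("ACG-T", "ACGAT"))        # 80.0
```

A value returned by such a function could then be compared against a chosen threshold, for example the 70% lower bound mentioned elsewhere in this description.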

The nucleic acid molecules of the invention can also be identified by, for example, a search of known databases for genes encoding polypeptides having a specified amino acid sequence identity or DNA having a specified nucleotide sequence identity. Methods of alignment of sequences for comparison are well known in the art and are described hereinabove.

In a further embodiment, the invention provides isolated nucleic acid molecules comprising a plant nucleotide sequence that induces transcription of a linked nucleic acid segment in a plant or plant cell, e.g., a linked nucleic acid molecule comprising an open reading frame for or encoding a structural or regulatory gene, in a tissue-specific or tissue-preferential manner.

In a specific embodiment, the invention provides isolated nucleic acid molecules comprising a plant nucleotide sequence that induces transcription of a linked nucleic acid segment in a plant or plant cell, e.g., a linked nucleic acid molecule comprising an open reading frame for or encoding a structural or regulatory gene, in a seed-specific or seed-preferential manner. In particular, the plant nucleotide sequence according to the invention is substantially less active in vegetative tissue as compared to seed and is most active in the endosperm. The transcription inducing activity increases during seed development and reaches its peak at or around the time of grain filling.

In particular, the nucleotide sequence of the invention directs seed-specific (e.g., endosperm-specific) or seed-preferential (e.g., endosperm-preferential) transcription of a linked nucleic acid segment in a plant or plant cell and is preferably obtained or obtainable from plant genomic DNA having a gene comprising an open reading frame (ORF) encoding a polypeptide which is substantially similar, and preferably has at least 70%, e.g., 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, and even 90% or more, e.g., 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, and 99%, amino acid sequence identity, to a polypeptide encoded by an Oryza, e.g., Oryza sativa, gene comprising any one of SEQ ID NOs: 2-462 (e.g., including a promoter obtained or obtainable from any one of SEQ ID NOs: 643-883) which directs seed-specific (or seed-preferential) transcription of a linked nucleic acid segment.

The promoters of the invention include a consecutive stretch of about 25 to 2000, including 50 to 500 or 100 to 250, and up to 1000 or 1500, contiguous nucleotides, e.g., 40 to about 750, 60 to about 750, 125 to about 750, 250 to about 750, 400 to about 750, 600 to about 750, of any one of SEQ ID NOs: 643-883, or the promoter orthologs thereof, which include the minimal promoter region.

In a particular embodiment of the invention said consecutive stretch of about 25 to 2000, including 50 to 500 or 100 to 250, and up to 1000 or 1500, contiguous nucleotides, e.g., 40 to about 750, 60 to about 750, 125 to about 750, 250 to about 750, 400 to about 750, 600 to about 750, has at least 75%, preferably 80%, more preferably 90% and most preferably 95%, nucleic acid sequence identity with a corresponding consecutive stretch of about 25 to 2000, including 50 to 500 or 100 to 250, and up to 1000 or 1500, contiguous nucleotides, e.g., 40 to about 750, 60 to about 750, 125 to about 750, 250 to about 750, 400 to about 750, 600 to about 750, of any one of SEQ ID NOs: 643-883 or the promoter orthologs thereof, which include the minimal promoter region. The above defined stretch of contiguous nucleotides preferably comprises one or more promoter motifs, e.g., for seed-specific promoters, motifs selected from the group consisting of the P box and GCN4 elements, including but not limited to TGTAAAG and TGA(G/C)TCA, and a transcription start site.
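As a rough illustration of how a candidate promoter stretch might be checked for the two example motifs named above, TGTAAAG and TGA(G/C)TCA, the Python sketch below scans the forward strand with regular expressions. The function name and the test sequence are hypothetical, and only the given strand is searched; scanning the opposite strand as well would be a natural extension.

```python
# Illustrative sketch only: function name and test sequence are hypothetical.
import re
from typing import List, Tuple

# Lookahead patterns so that overlapping occurrences are also reported.
MOTIFS = {
    "TGTAAAG": re.compile(r"(?=(TGTAAAG))"),
    "TGA(G/C)TCA": re.compile(r"(?=(TGA[GC]TCA))"),
}


def find_motifs(promoter: str) -> List[Tuple[str, int, str]]:
    """Return (motif label, 0-based start, matched text) for each hit, sorted by position."""
    seq = promoter.upper()
    hits = []
    for label, pattern in MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((label, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    made_up = "ccTGTAAAGttTGAGTCAaa"  # not a sequence from the Sequence Listing
    for hit in find_motifs(made_up):
        print(hit)
```

Counting the hits returned for each motif across a set of promoters is one simple way to compare motif frequency between groups of genes.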

In the case of promoters directing tissue-specific transcription of a linked nucleic acid segment in a plant or plant cell such as, for example, a promoter directing seed-specific or seed-preferential, but especially endosperm-specific or endosperm-preferential transcription, it is further preferred that the previously defined stretch of contiguous nucleotides comprises further motifs that participate in the tissue specificity of said stretch(es) of nucleotides.

Generally, the promoters of the invention may be employed to express a nucleic acid segment that is operably linked to said promoter such as, for example, an open reading frame, or a portion thereof, an anti-sense sequence, or a transgene in plants. The open reading frame may be obtained from an insect resistance gene, a disease resistance gene such as, for example, a bacterial disease resistance gene, a fungal disease resistance gene, a viral disease resistance gene, a nematode disease resistance gene, a herbicide resistance gene, a gene affecting grain composition or quality, a nutrient utilization gene, a mycotoxin reduction gene, a male sterility gene, a selectable marker gene, a screenable marker gene, a negative selectable marker, a positive selectable marker, a gene affecting plant agronomic characteristics, i.e., yield, standability, and the like, or an environment or resistance gene, i.e., one or more genes that confer herbicide resistance or tolerance, insect resistance or tolerance, disease resistance or tolerance (viral, bacterial, fungal, oomycete, or nematode), stress tolerance or resistance (as exemplified by resistance or tolerance to drought, heat, chilling, freezing, excessive moisture, salt stress, or oxidative stress), increased yields, food content and makeup, physical appearance, male sterility, drydown, standability, prolificacy, starch properties or quantity, oil quantity and quality, amino acid or protein composition, and the like. By “resistant” is meant a plant which exhibits substantially no phenotypic changes as a consequence of agent administration, infection with a pathogen, or exposure to stress. By “tolerant” is meant a plant which, although it may exhibit some phenotypic changes as a consequence of infection, does not have a substantially decreased reproductive capacity or substantially altered metabolism.

For instance, seed-specific promoters may be useful for expressing genes as well as for producing large quantities of protein, for expressing oils or proteins of interest, e.g., antibodies, genes for increasing the nutritional value of the seed and the like. In particular, the seed-specific or seed-preferential promoters according to the invention such as those provided in SEQ ID NOs: 643-883 may be useful for expressing the Open Reading Frames which are represented by the nucleotide sequences of SEQ ID NOs: 1-461 and 501-511, respectively.

Obtaining sufficient levels of transgene expression in the appropriate plant tissues is an important aspect in the production of genetically engineered crops. Expression of heterologous DNA sequences in a plant host is dependent upon the presence of an operably linked promoter that is functional within the plant host. Choice of the promoter sequence will determine when and where within the organism the heterologous DNA sequence is expressed.

It is specifically contemplated by the present invention that one could use any one of the promoters according to the present invention in unaltered or altered form. Mutagenization of a promoter of the present invention such as those provided in SEQ ID NOs: 643-883 may potentially improve the utility of the elements for the expression of transgenes in plants. The mutagenesis of these elements can be carried out at random and the mutagenized promoter sequences screened for activity in a trial-and-error procedure.

Alternatively, particular sequences which provide the promoter with desirable expression characteristics, or the promoter with expression enhancement activity, could be identified and these or similar sequences introduced into the sequences via mutation. It is further contemplated that one could mutagenize these sequences in order to enhance their expression of transgenes in a particular species.

The means for mutagenizing a DNA segment encoding a promoter sequence of the current invention are well-known to those of skill in the art. As indicated, modifications to a promoter or other regulatory element may be made by random or site-specific mutagenesis procedures. The promoter and other regulatory elements may be modified by altering their structure through the addition or deletion of one or more nucleotides from the sequence which encodes the corresponding un-modified sequences.

Mutagenesis may be performed in accordance with any of the techniques known in the art, such as, but not limited to, synthesizing an oligonucleotide having one or more mutations within the sequence of a particular regulatory region. In particular, site-specific mutagenesis is a technique useful in the preparation of promoter mutants, through specific mutagenesis of the underlying DNA. The technique further provides a ready ability to prepare and test sequence variants, for example, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the DNA. Site-specific mutagenesis allows the production of mutants through the use of specific oligonucleotide sequences which encode the DNA sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the deletion junction being traversed. Typically, a primer of about 17 to about 75 nucleotides or more in length is preferred, with about 10 to about 25 or more residues on both sides of the junction of the sequence being altered.
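By way of illustration only, the primer geometry described above (a mutated position flanked on each side by about 10 to about 25 unaltered residues) may be sketched as follows. The function name, the single-base substitution, the default flank of 12 and the 40-nucleotide template are assumptions made for this example and are not part of the disclosure.

```python
def design_mutagenic_primer(template: str, pos: int, new_base: str, flank: int = 12) -> str:
    """Build a primer carrying one base substitution at 0-based index `pos`.

    `flank` unchanged template bases are kept on each side of the altered
    position, following the "about 10 to about 25 residues" guidance.
    """
    template = template.upper()
    new_base = new_base.upper()
    if len(new_base) != 1 or new_base not in "ACGT":
        raise ValueError("new_base must be one of A, C, G, T")
    if not 10 <= flank <= 25:
        raise ValueError("flank is expected to be between 10 and 25")
    if pos - flank < 0 or pos + flank >= len(template):
        raise ValueError("not enough template sequence around pos")
    if template[pos] == new_base:
        raise ValueError("new_base equals the template base; nothing to mutate")
    left = template[pos - flank:pos]            # unaltered residues upstream
    right = template[pos + 1:pos + 1 + flank]   # unaltered residues downstream
    return left + new_base + right


# Hypothetical 40-nt template, used only for illustration.
template = "ATGC" * 10
primer = design_mutagenic_primer(template, pos=20, new_base="G")
print(primer, len(primer))
```

With a flank of 10 to 25 residues the resulting primer is 21 to 51 nucleotides long, which falls within the preferred range of about 17 to about 75 nucleotides. The sketch does not consider melting temperature or secondary structure.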

In general, the technique of site-specific mutagenesis is well known in the art, as exemplified by various publications. As will be appreciated, the technique typically employs a phage vector which exists in both a single stranded and double stranded form. Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage. These phage are readily commercially available and their use is generally well known to those skilled in the art.

Double stranded plasmids also are routinely employed in site-directed mutagenesis, which eliminates the step of transferring the gene of interest from a plasmid to a phage.

In general, site-directed mutagenesis in accordance herewith is performed by first obtaining a single-stranded vector or melting apart the two strands of a double stranded vector which includes within its sequence a DNA sequence which encodes the promoter. An oligonucleotide primer bearing the desired mutated sequence is prepared, generally synthetically. This primer is then annealed with the single-stranded vector, and subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated sequence and the second strand bears the desired mutation.

This heteroduplex vector is then used to transform or transfect appropriate cells, such as E. coli cells, and cells are selected which include recombinant vectors bearing the mutated sequence arrangement. Vector DNA can then be isolated from these cells and used for plant transformation. A genetic selection scheme was devised by Kunkel et al. (1987) to enrich for clones incorporating mutagenic oligonucleotides. Alternatively, the use of PCR with commercially available thermostable enzymes such as Taq polymerase may be used to incorporate a mutagenic oligonucleotide primer into an amplified DNA fragment that can then be cloned into an appropriate cloning or expression vector. The PCR-mediated mutagenesis procedures of Tomic et al. (1990) and Upender et al. (1995) provide two examples of such protocols. A PCR employing a thermostable ligase in addition to a thermostable polymerase also may be used to incorporate a phosphorylated mutagenic oligonucleotide into an amplified DNA fragment that may then be cloned into an appropriate cloning or expression vector. The mutagenesis procedure described by Michael (1994) provides an example of one such protocol.

The preparation of sequence variants of the selected promoter-encoding DNA segments using site-directed mutagenesis is provided as a means of producing potentially useful species and is not meant to be limiting as there are other ways in which sequence variants of DNA sequences may be obtained. For example, recombinant vectors encoding the desired promoter sequence may be treated with mutagenic agents, such as hydroxylamine, to obtain sequence variants.

As used herein, the term “oligonucleotide directed mutagenesis procedure” refers to template-dependent processes and vector-mediated propagation which result in an increase in the concentration of a specific nucleic acid molecule relative to its initial concentration, or in an increase in the concentration of a detectable signal, such as amplification. As used herein, the term “oligonucleotide directed mutagenesis procedure” also is intended to refer to a process that involves the template-dependent extension of a primer molecule. The term template-dependent process refers to nucleic acid synthesis of an RNA or a DNA molecule wherein the sequence of the newly synthesized strand of nucleic acid is dictated by the well-known rules of complementary base pairing (see, for example, Watson and Ramstad, 1987). Typically, vector mediated methodologies involve the introduction of the nucleic acid fragment into a DNA or RNA vector, the clonal amplification of the vector, and the recovery of the amplified nucleic acid fragment. Examples of such methodologies are provided by U.S. Pat. No. 4,237,224. A number of template dependent processes are available to amplify the target sequences of interest present in a sample, such methods being well known in the art and specifically disclosed herein below.

Where a clone comprising a promoter has been isolated in accordance with the instant invention, one may wish to delimit the essential promoter regions within the clone. One efficient, targeted means for preparing mutagenized promoters relies upon the identification of putative regulatory elements within the promoter sequence. This can be initiated by comparison with promoter sequences known to be expressed in a similar tissue-specific or developmentally unique manner. Sequences which are shared among promoters with similar expression patterns are likely candidates for the binding of transcription factors and are thus likely elements which confer expression patterns. Confirmation of these putative regulatory elements can be achieved by deletion analysis of each putative regulatory region followed by functional analysis of each deletion construct by assay of a reporter gene which is functionally attached to each construct. As such, once a starting promoter sequence is provided, any of a number of different deletion mutants of the starting promoter could be readily prepared.
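By way of illustration only, the comparison of promoters with similar expression patterns may be sketched as a search for short sequences present in every promoter of a set. The function name, the word length of 6 and the three short sequences are hypothetical; exact-match words on the forward strand are a simplification, and any candidate found this way would still require confirmation by deletion analysis as described above.

```python
def shared_kmers(promoters: dict[str, str], k: int = 6) -> set[str]:
    """Return the k-mers that occur in every promoter sequence (forward strand only)."""
    kmer_sets = []
    for seq in promoters.values():
        seq = seq.upper()
        kmer_sets.append({seq[i:i + k] for i in range(len(seq) - k + 1)})
    if not kmer_sets:
        return set()
    return set.intersection(*kmer_sets)


# Hypothetical promoter fragments, used only for illustration.
promoters = {
    "p1": "AAATGCATGTTT",
    "p2": "CCTGCATGGG",
    "p3": "GTGCATGA",
}
candidates = shared_kmers(promoters, k=6)
print(candidates)
```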

As indicated above, deletion mutants of the promoter of the invention also could be randomly prepared and then assayed. With this strategy, a series of constructs are prepared, each containing a different portion of the clone (a subclone), and these constructs are then screened for activity. A suitable means for screening for activity is to attach a deleted promoter or intron construct which contains a deleted segment to a selectable or screenable marker, and to isolate only those cells expressing the marker gene. In this way, a number of different, deleted promoter constructs are identified which still retain the desired, or even enhanced, activity. The smallest segment which is required for activity is thereby identified through comparison of the selected constructs. This segment may then be used for the construction of vectors for the expression of exogenous genes.
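By way of illustration only, the comparison of selected constructs may be sketched as the overlap of the regions retained by all constructs that remain active. The coordinates, the half-open interval convention and the assumption of a single contiguous essential region are choices made for this example and are not part of the disclosure.

```python
from typing import List, Optional, Tuple

# (start, end, active): region of the clone retained by a construct,
# in half-open coordinates, and whether the marker was expressed.
Construct = Tuple[int, int, bool]


def minimal_active_segment(constructs: List[Construct]) -> Optional[Tuple[int, int]]:
    """Return the region common to all active constructs, or None if there is none."""
    active = [(start, end) for start, end, is_active in constructs if is_active]
    if not active:
        return None
    start = max(s for s, _ in active)   # rightmost 5' boundary among active constructs
    end = min(e for _, e in active)     # leftmost 3' boundary among active constructs
    if start >= end:
        return None
    return (start, end)


# Hypothetical screening results, used only for illustration.
constructs = [
    (0, 1000, True),
    (200, 1000, True),
    (0, 800, True),
    (500, 1000, False),
    (0, 300, False),
]
segment = minimal_active_segment(constructs)
print(segment)
```

The overlap is an upper bound on the essential segment; finer deletions within it would be needed to narrow it further.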

Furthermore, it is contemplated that promoters combining elements from more than one promoter may be useful. For example, U.S. Pat. No. 5,491,288 discloses combining a Cauliflower Mosaic Virus promoter with a histone promoter. Thus, the elements from the promoters disclosed herein may be combined with elements from other promoters.

The present invention further provides a composition, an expression cassette or a recombinant vector containing the nucleic acid molecule of the invention as disclosed hereinbefore, and host cells comprising the expression cassette or vector, e.g., comprising a plasmid.

In particular, the present invention provides an expression cassette or a recombinant vector comprising a suitable promoter linked to a nucleic acid segment of the invention, representative examples of which are provided in the SEQ ID NOs of the Sequence Listing, which, when present in a plant, plant cell or plant tissue, results in transcription of the linked nucleic acid segment.

Promoters which are useful for plant transgene expression include those that are inducible, viral, synthetic, constitutive (Odell et al., 1985), temporally regulated, spatially regulated, tissue-specific, and spatio-temporally regulated.

Where expression in specific tissues or organs is desired, tissue-specific promoters may be used. In contrast, where gene expression in response to a stimulus is desired, inducible promoters are the regulatory elements of choice. Where continuous expression is desired throughout the cells of a plant, constitutive promoters are utilized. Additional regulatory sequences upstream and/or downstream from the core promoter sequence may be included in expression constructs of transformation vectors to bring about varying levels of expression of heterologous nucleotide sequences in a transgenic plant.

Suitable promoter and/or regulatory sequences further include those that are preferentially or specifically active in plant grain tissue such as, for example, the grain endosperm or the grain embryo.

Further, the invention provides isolated polypeptides encoded by any one of the open reading frames of the invention, representative examples of which are provided in the SEQ ID NOs of the Sequence Listing, or a fragment thereof, which encodes a polypeptide which has substantially the same activity as the corresponding polypeptide encoded by an ORF given in the SEQ ID NOs of the Sequence Listing, or the orthologs thereof.

Virtually any DNA composition may be used for delivery to recipient plant cells, e.g., monocotyledonous cells, to ultimately produce fertile transgenic plants in accordance with the present invention. For example, DNA segments or fragments in the form of vectors and plasmids, or linear DNA segments or fragments, in some instances containing only the DNA element to be expressed in the plant, and the like, may be employed. The construction of vectors which may be employed in conjunction with the present invention will be known to those of skill in the art in light of the present disclosure (see, e.g., Sambrook et al., 1989; Gelvin et al., 1990).

It is one of the objects of the present invention to provide recombinant DNA molecules comprising a nucleotide sequence which directs transcription according to the invention operably linked to a nucleic acid segment or sequence of interest.

The nucleic acid segment of interest can, for example, code for a ribosomal RNA, an antisense RNA or any other type of RNA that is not translated into protein. In another preferred embodiment of the invention, the nucleic acid segment of interest is translated into a protein product. The nucleotide sequence which directs transcription and/or the nucleic acid segment may be of homologous or heterologous origin with respect to the plant to be transformed. A recombinant DNA molecule useful for introduction into plant cells includes that which has been derived or isolated from any source, that may be subsequently characterized as to structure, size and/or function, chemically altered, and later introduced into plants. An example of a nucleotide sequence or segment of interest “derived” from a source would be a nucleotide sequence or segment that is identified as a useful fragment within a given organism, and which is then chemically synthesized in essentially pure form. An example of such a nucleotide sequence or segment of interest “isolated” from a source would be a nucleotide sequence or segment that is excised or removed from said source by chemical means, e.g., by the use of restriction endonucleases, so that it can be further manipulated, e.g., amplified, for use in the invention, by the methodology of genetic engineering. Such a nucleotide sequence or segment is commonly referred to as “recombinant.”

Therefore a useful nucleotide sequence, segment or fragment of interest includes completely synthetic DNA, semi-synthetic DNA, DNA isolated from biological sources, and DNA derived from introduced RNA. Generally, the introduced DNA is not originally resident in the plant genotype which is the recipient of the DNA, but it is within the scope of the invention to isolate a gene from a given plant genotype, and to subsequently introduce multiple copies of the gene into the same genotype, e.g., to enhance production of a given gene product such as a storage protein or a protein that is involved in carbohydrate metabolism or any other gene of interest as provided in the SEQ ID NOs of the sequence listing.

The introduced recombinant DNA molecule includes, but is not limited to, DNA from plant genes, and non-plant genes such as those from bacteria, yeasts, animals or viruses. The introduced DNA can include modified genes, portions of genes, or chimeric genes, including genes from the same or different genotype. The term “chimeric gene” or “chimeric DNA” is defined as a gene or DNA sequence or segment comprising at least two DNA sequences or segments from species which do not combine DNA under natural conditions, or which DNA sequences or segments are positioned or linked in a manner which does not normally occur in the native genome of an untransformed plant.

The introduced recombinant DNA molecule used for transformation herein may be circular or linear, double-stranded or single-stranded. Generally, the DNA is in the form of chimeric DNA, such as plasmid DNA, that can also contain coding regions flanked by regulatory sequences which promote the expression of the recombinant DNA present in the resultant plant.

Generally, the introduced recombinant DNA molecule will be relatively small, i.e., less than about 30 kb, to minimize any susceptibility to physical, chemical, or enzymatic degradation, which is known to increase as the size of the nucleotide molecule increases. As noted above, the number of proteins, RNA transcripts or mixtures thereof which is introduced into the plant genome is preferably preselected and defined, e.g., from one to about 5-10 such products of the introduced DNA may be formed.

This expression cassette or vector may be contained in a host cell. The expression cassette or vector may augment the genome of a transformed plant or may be maintained extrachromosomally. The expression cassette may be operatively linked to a structural gene, the open reading frame thereof, or a portion thereof. The expression cassette may further comprise a Ti plasmid and be contained in an Agrobacterium tumefaciens cell; it may be carried on a microparticle, wherein the microparticle is suitable for ballistic transformation of a plant cell; or it may be contained in a plant cell or protoplast. Further, the expression cassette or vector can be contained in a transformed plant or cells thereof, and the plant may be a dicot or a monocot. In particular, the plant may be a cereal plant.

Obtaining sufficient levels of transgene expression in the appropriate plant tissues is an important aspect in the production of genetically engineered crops. Expression of heterologous DNA sequences in a plant host is dependent upon the presence of an operably linked promoter that is functional within the plant host. Choice of the promoter sequence will determine when and where within the organism the heterologous DNA sequence is expressed.

For example, for overexpression, a plant promoter fragment may be employed which will direct expression of the gene in all tissues of a regenerated plant. Such promoters are referred to herein as “constitutive” promoters and are active under most environmental conditions and states of development or cell differentiation. Examples of constitutive promoters include the cauliflower mosaic virus (CaMV) 35S transcription initiation region, the 1′- or 2′-promoter derived from T-DNA of Agrobacterium tumefaciens, and other transcription initiation regions from various plant genes known to those of skill. Such genes include, for example, the AP2 gene, ACT11 from Arabidopsis (Huang et al. Plant Mol. Biol. 33:125-139 (1996)), Cat3 from Arabidopsis (GenBank No. U43147, Zhong et al., Mol. Gen. Genet. 251:196-203 (1996)), the gene encoding stearoyl-acyl carrier protein desaturase from Brassica napus (GenBank No. X74782, Solocombe et al. Plant Physiol. 104:1167-1176 (1994)), GPc1 from maize (GenBank No. X15596, Martinez et al. J. Mol. Biol. 208:551-565 (1989)), and Gpc2 from maize (GenBank No. U45855, Manjunath et al., Plant Mol. Biol. 33:97-112 (1997)).

Alternatively, the plant promoter may direct expression of the nucleic acid molecules of the invention in a specific tissue or may be otherwise under more precise environmental or developmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions, elevated temperature, or the presence of light. Such promoters are referred to here as “inducible” or “tissue-specific” promoters. One of skill will recognize that a tissue-specific promoter may drive expression of operably linked sequences in tissues other than the target tissue. Thus, as used herein a tissue-specific promoter is one that drives expression preferentially in the target tissue, but may also lead to some expression in other tissues as well.

Examples of promoters under developmental control include promoters that initiate transcription only (or primarily only) in certain tissues, such as fruit, seeds, or flowers. Promoters that direct expression of nucleic acids in ovules, flowers or seeds are particularly useful in the present invention. As used herein a seed-specific or preferential promoter is one which directs expression specifically or preferentially in seed tissues; such promoters may be, for example, ovule-specific, embryo-specific, endosperm-specific, integument-specific, seed coat-specific, or some combination thereof. Examples include a promoter from the ovule-specific BEL1 gene described in Reiser et al. Cell 83:735-742 (1995) (GenBank No. U39944). Other suitable seed specific promoters are derived from the following genes: MAC1 from maize (Sheridan et al. Genetics 142:1009-1020 (1996)), Cat3 from maize (GenBank No. L05934, Abler et al., Plant Mol. Biol. 22:10131-1038 (1993)), the gene encoding oleosin 18 kD from maize (GenBank No. J05212, Lee et al., Plant Mol. Biol. 26:1981-1987 (1994)), viviparous-1 from Arabidopsis (GenBank No. U93215), the gene encoding oleosin from Arabidopsis (GenBank No. Z17657), Atmycl from Arabidopsis (Urao et al., Plant Mol. Biol. 32:571-576 (1996)), the 2S seed storage protein gene family from Arabidopsis (Conceicao et al. Plant 5:493-505 (1994)), the gene encoding oleosin 20 kD from Brassica napus (GenBank No. M63985), napA from Brassica napus (GenBank No. J02798, Josefsson et al. JBL 26:12196-1301 (1987)), the napin gene family from Brassica napus (Sjodahl et al. Planta 197:264-271 (1995)), the gene encoding the 2S storage protein from Brassica napus (Dasgupta et al. Gene 133:301-302 (1993)), the genes encoding oleosin A (GenBank No. U09118) and oleosin B (GenBank No. U09119) from soybean and the gene encoding low molecular weight sulphur rich protein from soybean (Choi et al. Mol. Gen. Genet. 246:266-268 (1995)).

It is specifically contemplated that one could use one of the promoters that are disclosed in co-pending provisional U.S. application Ser. No. 60/325,448, filed Sep. 26, 2001 in unaltered or altered form. Especially preferred are promoters that direct transcription of an associated nucleic acid molecule specifically or preferentially in tissues of the plant grain such as those provided in SEQ ID NOs: 2275-2672.

Mutagenization of a promoter such as those mentioned hereinbefore or those provided in provisional U.S. application Ser. No. 60/325,448 may potentially improve the utility of the elements for the expression of transgenes in plants. The mutagenesis of these elements can be carried out at random and the mutagenized promoter sequences screened for activity in a trial-and-error procedure.

A variety of 5′ and 3′ transcriptional regulatory sequences are available for use in the present invention. Transcriptional terminators are responsible for the termination of transcription and correct mRNA polyadenylation. The 3′ nontranslated regulatory DNA sequence preferably includes from about 50 to about 1,000, more preferably about 100 to about 1,000, nucleotide base pairs and contains plant transcriptional and translational termination sequences. Appropriate transcriptional terminators and those which are known to function in plants include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator, the pea rbcS E9 terminator, the terminator for the T7 transcript from the octopine synthase gene of Agrobacterium tumefaciens, and the 3′ end of the protease inhibitor I or II genes from potato or tomato, although other 3′ elements known to those of skill in the art can also be employed. Alternatively, one also could use a gamma coixin, oleosin 3 or other terminator from the genus Coix.

Preferred 3′ elements include those from the nopaline synthase gene of Agrobacterium tumefaciens (Bevan et al., 1983), the terminator for the T7 transcript from the octopine synthase gene of Agrobacterium tumefaciens, and the 3′ end of the protease inhibitor I or II genes from potato or tomato.

As the DNA sequence between the transcription initiation site and the start of the coding sequence, i.e., the untranslated leader sequence, can influence gene expression, one may also wish to employ a particular leader sequence. Preferred leader sequences are contemplated to include those which include sequences predicted to direct optimum expression of the attached gene, i.e., to include a preferred consensus leader sequence which may increase or maintain mRNA stability and prevent inappropriate initiation of translation. The choice of such sequences will be known to those of skill in the art in light of the present disclosure. Sequences that are derived from genes that are highly expressed in plants will be most preferred.

Other sequences that have been found to enhance gene expression in transgenic plants include intron sequences (e.g., from Adh1, bronze1, actin1, actin 2 (WO 00/760067), or the sucrose synthase intron) and viral leader sequences (e.g., from TMV, MCMV and AMV). For example, a number of non-translated leader sequences derived from viruses are known to enhance expression. Specifically, leader sequences from Tobacco Mosaic Virus (TMV), Maize Chlorotic Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AMV) have been shown to be effective in enhancing expression (e.g., Gallie et al., 1987; Skuzeski et al., 1990). Other leaders known in the art include but are not limited to: Picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5′ noncoding region) (Elroy-Stein et al., 1989); Potyvirus leaders, for example, TEV leader (Tobacco Etch Virus); MDMV leader (Maize Dwarf Mosaic Virus); Human immunoglobulin heavy-chain binding protein (BiP) leader (Macejak et al., 1991); Untranslated leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling et al., 1987); Tobacco mosaic virus leader (TMV) (Gallie et al., 1989); and Maize Chlorotic Mottle Virus leader (MCMV) (Lommel et al., 1991). See also, Della-Cioppa et al., 1987.

Regulatory elements such as Adh intron 1 (Callis et al., 1987), sucrose synthase intron (Vasil et al., 1989) or TMV omega element (Gallie et al., 1989), may further be included where desired.

Examples of enhancers include elements from the CaMV 35S promoter, octopine synthase genes (Ellis et al., 1987), the rice actin I gene, the maize alcohol dehydrogenase gene (Callis et al., 1987), the maize shrunken I gene (Vasil et al., 1989), TMV omega element (Gallie et al., 1989) and promoters from non-plant eukaryotes (e.g. yeast; Ma et al., 1988).

Two principal methods for the control of expression are known, viz.: overexpression and underexpression. Overexpression can be achieved by insertion of one or more than one extra copy of the selected gene. It is, however, not unknown for plants or their progeny, originally transformed with one or more than one extra copy of a nucleotide sequence, to exhibit the effects of underexpression as well as overexpression. For underexpression there are two principal methods which are commonly referred to in the art as “antisense downregulation” and “sense downregulation” (sense downregulation is also referred to as “cosuppression”). Generically these processes are referred to as “gene silencing”. Both of these methods lead to an inhibition of expression of the target gene.

Within the scope of the present invention, the alteration in expression of the nucleic acid molecule of the present invention may be achieved in one of the following ways:

(1) “Sense” Suppression

Alteration of the expression of a nucleotide sequence of the present invention, preferably reduction of its expression, is obtained by “sense” suppression (referenced in e.g. Jorgensen et al. (1996) Plant Mol. Biol. 31, 957-973). In this case, the entirety or a portion of a nucleotide sequence of the present invention is comprised in a DNA molecule. The DNA molecule is preferably operatively linked to a promoter functional in a cell comprising the target gene, preferably a plant cell, and introduced into the cell, in which the nucleotide sequence is expressible. The nucleotide sequence is inserted in the DNA molecule in the “sense orientation”, meaning that the coding strand of the nucleotide sequence can be transcribed. In a preferred embodiment, the nucleotide sequence is fully translatable and all the genetic information comprised in the nucleotide sequence, or portion thereof, is translated into a polypeptide. In another preferred embodiment, the nucleotide sequence is partially translatable and a short peptide is translated. In a preferred embodiment, this is achieved by inserting at least one premature stop codon in the nucleotide sequence, which brings translation to a halt. In another more preferred embodiment, the nucleotide sequence is transcribed but no translation product is being made. This is usually achieved by removing the start codon, e.g. the “ATG”, of the polypeptide encoded by the nucleotide sequence. In a further preferred embodiment, the DNA molecule comprising the nucleotide sequence, or a portion thereof, is stably integrated in the genome of the plant cell. In another preferred embodiment, the DNA molecule comprising the nucleotide sequence, or a portion thereof, is comprised in an extrachromosomally replicating molecule.

In transgenic plants containing one of the DNA molecules described immediately above, the expression of the nucleotide sequence corresponding to the nucleotide sequence comprised in the DNA molecule is preferably reduced. Preferably, the nucleotide sequence in the DNA molecule is at least 70% identical to the nucleotide sequence the expression of which is reduced, more preferably it is at least 80% identical, yet more preferably at least 90% identical, yet more preferably at least 95% identical, yet more preferably at least 99% identical.
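By way of illustration only, the identity thresholds recited above may be checked as sketched below. The sketch assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identity over all aligned columns; other definitions of percent identity exist, and the function names and example sequences are hypothetical.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")      # ignore columns that are gaps in both
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pid: float) -> Optional[int]:
    """Return the highest recited threshold (99, 95, 90, 80 or 70) that pid meets."""
    for threshold in (99, 95, 90, 80, 70):
        if pid >= threshold:
            return threshold
    return None


# Hypothetical aligned sequences, used only for illustration.
pid = percent_identity("ATGCATGCAT", "ATGCATGCAA")
print(pid, identity_tier(pid))
```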

(2) “Antisense” Suppression

In another preferred embodiment, the alteration of the expression of a nucleotide sequence of the present invention, preferably the reduction of its expression, is obtained by “anti-sense” suppression. The entirety or a portion of a nucleotide sequence of the present invention is comprised in a DNA molecule. The DNA molecule is preferably operatively linked to a promoter functional in a plant cell, and introduced in a plant cell, in which the nucleotide sequence is expressible. The nucleotide sequence is inserted in the DNA molecule in the “anti-sense orientation”, meaning that the reverse complement (sometimes also called the noncoding strand) of the nucleotide sequence can be transcribed. In a preferred embodiment, the DNA molecule comprising the nucleotide sequence, or a portion thereof, is stably integrated in the genome of the plant cell. In another preferred embodiment the DNA molecule comprising the nucleotide sequence, or a portion thereof, is comprised in an extrachromosomally replicating molecule. Several publications describing this approach are cited for further illustration (Green, P. J. et al., Ann. Rev. Biochem. 55:569-597 (1986); van der Krol, A. R. et al., Antisense Nuc. Acids & Proteins, pp. 125-141 (1991); Abel, P. P. et al., Proc. Natl. Acad. Sci. USA 86:6949-6952 (1989); Ecker, J. R. et al., Proc. Natl. Acad. Sci. USA 83:5372-5376 (August 1986)).

In transgenic plants containing one of the DNA molecules described immediately above, the expression of the nucleotide sequence corresponding to the nucleotide sequence comprised in the DNA molecule is preferably reduced. Preferably, the nucleotide sequence in the DNA molecule is at least 70% identical to the nucleotide sequence the expression of which is reduced, more preferably it is at least 80% identical, yet more preferably at least 90% identical, yet more preferably at least 95% identical, yet more preferably at least 99% identical.
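By way of illustration only, the “anti-sense orientation” referred to in this section corresponds to inserting the reverse complement of the coding strand, which may be computed as sketched below. The function name and the six-base example are hypothetical, and only the unambiguous bases A, C, G and T are handled.

```python
# Translation table mapping each base to its complement (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical coding-strand fragment, used only for illustration.
sense = "ATGGCC"
antisense = reverse_complement(sense)
print(antisense)
```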

(3) Homologous Recombination

In another preferred embodiment, at least one genomic copy corresponding to a nucleotide sequence of the present invention is modified in the genome of the plant by homologous recombination as further illustrated in Paszkowski et al., EMBO Journal 7:4021-26 (1988). This technique uses the property of homologous sequences to recognize each other and to exchange nucleotide sequences between each other by a process known in the art as homologous recombination. Homologous recombination can occur between the chromosomal copy of a nucleotide sequence in a cell and an incoming copy of the nucleotide sequence introduced in the cell by transformation. Specific modifications are thus accurately introduced in the chromosomal copy of the nucleotide sequence. In one embodiment, the regulatory elements of the nucleotide sequence of the present invention are modified. Such regulatory elements are easily obtainable by screening a genomic library using the nucleotide sequence of the present invention, or a portion thereof, as a probe. The existing regulatory elements are replaced by different regulatory elements, thus altering expression of the nucleotide sequence, or they are mutated or deleted, thus abolishing the expression of the nucleotide sequence. In another embodiment, the nucleotide sequence is modified by deletion of a part of the nucleotide sequence or the entire nucleotide sequence, or by mutation. Expression of a mutated polypeptide in a plant cell is also contemplated in the present invention. More recent refinements of this technique to disrupt endogenous plant genes have been described (Kempin et al., Nature 389:802-803 (1997) and Miao and Lam, Plant J., 7:359-365 (1995)).

In another preferred embodiment, a mutation in the chromosomal copy of a nucleotide sequence is introduced by transforming a cell with a chimeric oligonucleotide composed of a contiguous stretch of RNA and DNA residues in a duplex conformation with double hairpin caps on the ends. An additional feature of the oligonucleotide is for example the presence of 2′-O-methylation at the RNA residues. The RNA/DNA sequence is designed to align with the sequence of a chromosomal copy of a nucleotide sequence of the present invention and to contain the desired nucleotide change. For example, this technique is further illustrated in U.S. Pat. No. 5,501,967 and Zhu et al. (1999) Proc. Natl. Acad. Sci. USA 96:8768-8773.

(4) Ribozymes

In a further embodiment, the RNA coding for a polypeptide of the present invention is cleaved by a catalytic RNA, or ribozyme, specific for such RNA. The ribozyme is expressed in transgenic plants and results in reduced amounts of RNA coding for the polypeptide of the present invention in plant cells, thus leading to reduced amounts of polypeptide accumulated in the cells. This method is further illustrated in U.S. Pat. No. 4,987,071.

(5) Dominant-Negative Mutants

In another preferred embodiment, the activity of the polypeptide encoded by the nucleotide sequences of this invention is changed. This is achieved by expression of dominant negative mutants of the proteins in transgenic plants, leading to the loss of activity of the endogenous protein.

(6) Aptamers

In a further embodiment, the activity of the polypeptide of the present invention is inhibited by expressing in transgenic plants nucleic acid ligands, so-called aptamers, which specifically bind to the protein. Aptamers are preferentially obtained by the SELEX (Systematic Evolution of Ligands by EXponential Enrichment) method. In the SELEX method, a candidate mixture of single stranded nucleic acids having regions of randomized sequence is contacted with the protein and those nucleic acids having an increased affinity to the target are partitioned from the remainder of the candidate mixture. The partitioned nucleic acids are amplified to yield a ligand enriched mixture. After several iterations a nucleic acid with optimal affinity to the polypeptide is obtained and is used for expression in transgenic plants. This method is further illustrated in U.S. Pat. No. 5,270,163.
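
The partition-and-amplify cycle described above can be pictured with a small simulation. The following Python sketch is illustrative only and is not taken from the cited patent: the scoring function `toy_affinity`, the motif `GGAUCC`, the pool size and the fraction retained per round are all hypothetical stand-ins for a real binding assay, and amplification is modeled as exact copying without mutation.

```python
import random


def toy_affinity(seq, motif="GGAUCC"):
    """Hypothetical affinity score: best count of motif matches in any window."""
    best = 0
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        best = max(best, sum(a == b for a, b in zip(window, motif)))
    return best


def selex_round(pool, affinity, keep_fraction=0.2):
    """One cycle: partition the best binders, then amplify back to pool size."""
    ranked = sorted(pool, key=affinity, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    binders = ranked[:n_keep]                      # partitioning step
    return [binders[i % n_keep] for i in range(len(pool))]  # amplification step


def run_selex(rounds=5, pool_size=200, length=20, seed=0):
    """Iterate SELEX rounds on a random RNA pool; return pool and mean scores."""
    rng = random.Random(seed)
    pool = ["".join(rng.choice("ACGU") for _ in range(length))
            for _ in range(pool_size)]
    history = [sum(map(toy_affinity, pool)) / len(pool)]
    for _ in range(rounds):
        pool = selex_round(pool, toy_affinity)
        history.append(sum(map(toy_affinity, pool)) / len(pool))
    return pool, history
```

Under these assumptions the mean score recorded in `history` cannot decrease from one round to the next, which mirrors the enrichment of the ligand mixture over successive iterations.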

(7) Zinc Finger Proteins

A zinc finger protein that binds to a nucleotide sequence of the present invention or to its regulatory region is also used to alter expression of the nucleotide sequence. Preferably, transcription of the nucleotide sequence is reduced or increased. Zinc finger proteins are for example described in Beerli et al. (1998) PNAS 95:14628-14633, or in WO 95/19431, WO 98/54311, or WO 96/06166, all incorporated herein by reference in their entirety.

(8) dsRNA

Alteration of the expression of a nucleotide sequence of the present invention is also obtained by dsRNA interference as described for example in WO 99/32619, WO 99/53050 or WO 99/61631, all incorporated herein by reference in their entirety.

(9) Insertion of a DNA Molecule (Insertional Mutagenesis)

In another preferred embodiment, a DNA molecule is inserted into a chromosomal copy of a nucleotide sequence of the present invention, or into a regulatory region thereof. Preferably, such DNA molecule comprises a transposable element capable of transposition in a plant cell, such as Ac/Ds, En/Spm, or Mutator. Alternatively, the DNA molecule comprises a T-DNA border of an Agrobacterium T-DNA. The DNA molecule may also comprise a recombinase or integrase recognition site which can be used to remove part of the DNA molecule from the chromosome of the plant cell. An example of this method is set forth in Example 2. Methods of insertional mutagenesis using T-DNA, transposons, oligonucleotides or other methods known to those skilled in the art are also encompassed. Methods of using T-DNA and transposons for insertional mutagenesis are described in Winkler et al. (1989) Methods Mol. Biol. 82:129-136 and Martienssen (1998) PNAS 95:2021-2026, incorporated herein by reference in their entireties.

(10) Deletion Mutagenesis

In yet another embodiment, a mutation of a nucleic acid molecule of the present invention is created in the genomic copy of the sequence in the cell or plant by deletion of a portion of the nucleotide sequence or regulatory sequence. Methods of deletion mutagenesis are known to those skilled in the art. See, for example, Miao et al. (1995) Plant J. 7:359.

In yet another embodiment, this deletion is created at random in a large population of plants by chemical mutagenesis or irradiation, and a plant with a deletion in a gene of the present invention is isolated by forward or reverse genetics. Irradiation with fast neutrons or gamma rays is known to cause deletion mutations in plants (Silverstone et al. (1998) Plant Cell 10:155-169; Bruggemann et al. (1996) Plant J. 10:755-760; Redei and Koncz in Methods in Arabidopsis Research, World Scientific Press (1992), pp. 16-82). Deletion mutations in a gene of the present invention can be recovered in a reverse genetics strategy using PCR with pooled sets of genomic DNAs, as has been shown in C. elegans (Liu et al. (1999) Genome Research 9:859-867). A forward genetics strategy would involve mutagenesis of a line displaying PTGS followed by screening the M2 progeny for the absence of PTGS. Some of these mutants would be expected to disrupt a gene of the present invention. This could be assessed by Southern blot or PCR for a gene of the present invention with genomic DNA from these mutants.
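
The pooled-PCR reverse genetics strategy mentioned above can be sketched in Python. This is a minimal illustration under stated assumptions, not the protocol of Liu et al.: the row-and-column pooling layout is one common pooling design chosen here for simplicity, and `pool_is_positive` is a hypothetical stand-in for a PCR reaction that yields a shortened product when any plant in the pool carries the deletion.

```python
def pooled_deletion_screen(grid, pool_is_positive):
    """Screen a grid of plant IDs using one pooled reaction per row and column.

    grid             -- list of equal-length rows of plant identifiers
    pool_is_positive -- callable taking a list of plant IDs and returning True
                        if the pooled DNA shows the deletion product
                        (hypothetical stand-in for a PCR assay)
    Returns (candidate plant IDs, number of pooled reactions run).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    row_pools = [list(row) for row in grid]
    col_pools = [[grid[r][c] for r in range(n_rows)] for c in range(n_cols)]

    row_hits = [r for r, pool in enumerate(row_pools) if pool_is_positive(pool)]
    col_hits = [c for c, pool in enumerate(col_pools) if pool_is_positive(pool)]

    # A carrier must sit at the intersection of a positive row and column.
    candidates = [grid[r][c] for r in row_hits for c in col_hits]
    return candidates, n_rows + n_cols


# Usage: 20 plants in a 4 x 5 grid, with plant 7 carrying the deletion.
plants = [[r * 5 + c for c in range(5)] for r in range(4)]
carriers = {7}
candidates, reactions = pooled_deletion_screen(
    plants, lambda pool: any(p in carriers for p in pool))
```

With a single carrier, the nine pooled reactions identify it directly. With several carriers the intersections can include non-carriers, so each candidate would still need an individual confirmation, for example by the Southern blot or PCR mentioned above.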

(11) Overexpression in a Plant Cell

In yet another preferred embodiment, a nucleotide sequence of the present invention encoding a polypeptide comprising a 3′-5′ exonuclease domain and/or activity in a plant cell is overexpressed. Examples of nucleic acid molecules and expression cassettes for overexpression of a nucleic acid molecule of the present invention are described above. Methods known to those skilled in the art of over-expression of nucleic acid molecules are also encompassed by the present invention.

In still another embodiment, the expression of the nucleotide sequence of the present invention is altered in every cell of a plant. This is for example obtained through homologous recombination or by insertion in the chromosome. This is also for example obtained by expressing a sense or antisense RNA, zinc finger protein or ribozyme under the control of a promoter capable of expressing the sense or antisense RNA, zinc finger protein or ribozyme in every cell of a plant. Constitutive, inducible, tissue-specific or developmentally-regulated expression is also within the scope of the present invention and results in a constitutive, inducible, tissue-specific or developmentally-regulated alteration of the expression of a nucleotide sequence of the present invention in the plant cell. Constructs for expression of the sense or antisense RNA, zinc finger protein or ribozyme, or for overexpression of a nucleotide sequence of the present invention, are prepared and transformed into a plant cell according to the teachings of the present invention, e.g. as described infra.

The invention hence also provides sense and anti-sense nucleic acid molecules corresponding to the open reading frames identified in the SEQ ID NOs of the Sequence Listing as well as their orthologs.
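
As a small illustration of how an anti-sense sequence relates to a sense open reading frame, the following Python sketch computes the reverse complement of a DNA string. The function name `antisense` and the example sequence are hypothetical and do not correspond to any SEQ ID NO of the Sequence Listing.

```python
# Translation table for complementing DNA bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense_dna):
    """Return the reverse complement of a sense-strand DNA sequence."""
    return sense_dna.translate(_COMPLEMENT)[::-1]


# Usage with a made-up six-base sequence.
example = antisense("ATGGCC")
```

Applying the function twice returns the original sense sequence, which is a convenient sanity check when preparing paired sense and anti-sense constructs.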

The genes and open reading frames according to the present invention which are substantially similar to a nucleotide sequence encoding a polypeptide as given in any one of the SEQ ID NOs of the Sequence Listing, including any corresponding antisense constructs, can be operably linked to any promoter that is functional within the plant host, including the promoter sequences according to the invention or mutants thereof.

The present invention further provides a method of augmenting a plant genome by contacting plant cells with a nucleic acid molecule of the invention, e.g., one having a nucleotide sequence that directs tissue-specific or tissue-preferential transcription of a linked nucleic acid segment isolatable or obtained from a plant gene encoding a polypeptide that is substantially similar to a polypeptide encoded by an Oryza gene having a sequence according to any one of SEQ ID NOs provided in the Sequence Listing, so as to yield transformed plant cells; and regenerating the transformed plant cells to provide a differentiated transformed plant, wherein the differentiated transformed plant expresses the nucleic acid molecule in the cells of the plant, preferably in the appropriate tissues of the plant grain. The nucleic acid molecule may be present in the nucleus, chloroplast, mitochondria and/or plastid of the cells of the plant.

Plant species may be transformed with the DNA construct of the present invention by the DNA-mediated transformation of plant cell protoplasts and subsequent regeneration of the plant from the transformed protoplasts in accordance with procedures well known in the art.

Any plant tissue capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may be transformed with a vector of the present invention. The term “organogenesis,” as used herein, means a process by which shoots and roots are developed sequentially from meristematic centers; the term “embryogenesis,” as used herein, means a process by which shoots and roots develop together in a concerted fashion (not sequentially), whether from somatic cells or gametes. The particular tissue chosen will vary depending on the clonal propagation systems available for, and best suited to, the particular species being transformed. Exemplary tissue targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue, existing meristematic tissue (e.g., apical meristems, axillary buds, and root meristems), and induced meristem tissue (e.g., cotyledon meristem and hypocotyl meristem).

Plants of the present invention may take a variety of forms. The plants may be chimeras of transformed cells and non-transformed cells; the plants may be clonal transformants (e.g., all cells transformed to contain the expression cassette); the plants may comprise grafts of transformed and untransformed tissues (e.g., a transformed root stock grafted to an untransformed scion in citrus species). The transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, first generation (or T1) transformed plants may be selfed to give homozygous second generation (or T2) transformed plants, and the T2 plants further propagated through classical breeding techniques. A dominant selectable marker (such as npt II) can be associated with the expression cassette to assist in breeding.
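
The expected outcome of selfing a T1 plant can be worked through with a short calculation. The Python sketch below assumes, purely for illustration, a T1 plant carrying a single transgene insertion at one locus in the hemizygous state and simple Mendelian segregation; neither assumption is stated in the text above, and the allele labels `T` (transgene present) and `-` (absent) are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def selfing_ratios(parent=("T", "-")):
    """Expected genotype fractions among progeny of a selfed diploid parent.

    parent -- the two alleles at the transgene locus; "T" marks the insertion
              and "-" its absence (hypothetical labels).
    """
    # Every combination of one maternal and one paternal gamete is equally likely.
    counts = Counter("".join(sorted(pair)) for pair in product(parent, repeat=2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Usage: T2 progeny classes from a hemizygous T1 parent.
t2 = selfing_ratios()
```

Under these assumptions one quarter of the T2 progeny are homozygous for the insertion (`TT`), one half are hemizygous (`-T`) and one quarter lack it (`--`), so a dominant selectable marker alone does not distinguish homozygous from hemizygous plants.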

Thus, the present invention provides a transformed (transgenic) plant cell in planta or ex planta, including a transformed plastid or other organelle, e.g., nucleus, mitochondria or chloroplast. The present invention may be used for transformation of any plant species, including, but not limited to, cells from corn (Zea mays), Brassica sp. (e.g., B. napus, B. rapa, B. juncea), particularly those Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), finger millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius), wheat (Triticum aestivum), soybean (Glycine max), tobacco (Nicotiana tabacum), potato (Solanum tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium barbadense, Gossypium hirsutum), sweet potato (Ipomoea batatas), cassava (Manihot esculenta), coffee (Coffea spp.), coconut (Cocos nucifera), pineapple (Ananas comosus), citrus trees (Citrus spp.), cocoa (Theobroma cacao), tea (Camellia sinensis), banana (Musa spp.), avocado (Persea americana), fig (Ficus carica), guava (Psidium guajava), mango (Mangifera indica), olive (Olea europaea), papaya (Carica papaya), cashew (Anacardium occidentale), macadamia (Macadamia integrifolia), almond (Prunus amygdalus), sugar beets (Beta vulgaris), sugarcane (Saccharum spp.), oats, duckweed (Lemna), barley, vegetables, ornamentals, and conifers.

Duckweed (Lemna, see WO 00/07210) includes members of the family Lemnaceae. Four genera and 34 species of duckweed are known, as follows: genus Lemna (L. aequinoctialis, L. disperma, L. ecuadoriensis, L. gibba, L. japonica, L. minor, L. miniscula, L. obscura, L. perpusilla, L. tenera, L. trisulca, L. turionifera, L. valdiviana); genus Spirodela (S. intermedia, S. polyrrhiza, S. punctata); genus Wolffia (Wa. angusta, Wa. arrhiza, Wa. australina, Wa. borealis, Wa. brasiliensis, Wa. columbiana, Wa. elongata, Wa. globosa, Wa. microscopica, Wa. neglecta) and genus Wolffiella (Wl. caudata, Wl. denticulata, Wl. gladiata, Wl. hyalina, Wl. lingulata, Wl. repunda, Wl. rotunda, and Wl. neotropica). Any other genera or species of Lemnaceae, if they exist, are also aspects of the present invention. Lemna gibba, Lemna minor, and Lemna miniscula are preferred, with Lemna minor and Lemna miniscula being most preferred. Lemna species can be classified using the taxonomic scheme described by Landolt, Biosystematic Investigation on the Family of Duckweeds: The Family of Lemnaceae - A Monograph Study. Geobotanischen Institut ETH, Stiftung Rubel, Zurich (1986).

Vegetables within the scope of the invention include tomatoes (Lycopersicon esculentum), lettuce (e.g., Lactuca sativa), green beans (Phaseolus vulgaris), lima beans (Phaseolus limensis), peas (Lathyrus spp.), and members of the genus Cucumis such as cucumber (C. sativus), cantaloupe (C. cantalupensis), and musk melon (C. melo). Ornamentals include azalea (Rhododendron spp.), hydrangea (Hydrangea macrophylla), hibiscus (Hibiscus rosa-sinensis), roses (Rosa spp.), tulips (Tulipa spp.), daffodils (Narcissus spp.), petunias (Petunia hybrida), carnation (Dianthus caryophyllus), poinsettia (Euphorbia pulcherrima), and chrysanthemum. Conifers that may be employed in practicing the present invention include, for example, pines such as loblolly pine (Pinus taeda), slash pine (Pinus elliottii), ponderosa pine (Pinus ponderosa), lodgepole pine (Pinus contorta), and Monterey pine (Pinus radiata); Douglas-fir (Pseudotsuga menziesii); Western hemlock (Tsuga heterophylla); Sitka spruce (Picea glauca); redwood (Sequoia sempervirens); true firs such as silver fir (Abies amabilis) and balsam fir (Abies balsamea); and cedars such as Western red cedar (Thuja plicata) and Alaska yellow-cedar (Chamaecyparis nootkatensis). Leguminous plants include beans and peas. Beans include guar, locust bean, fenugreek, soybean, garden beans, cowpea, mungbean, lima bean, fava bean, lentils, chickpea, etc. Legumes include, but are not limited to, Arachis, e.g., peanuts; Vicia, e.g., crown vetch, hairy vetch, adzuki bean, mung bean, and chickpea; Lupinus, e.g., lupine; Trifolium; Phaseolus, e.g., common bean and lima bean; Pisum, e.g., field bean; Melilotus, e.g., clover; Medicago, e.g., alfalfa; Lotus, e.g., trefoil; Lens, e.g., lentil; and false indigo. Preferred forage and turf grass for use in the methods of the invention include alfalfa, orchard grass, tall fescue, perennial ryegrass, creeping bent grass, and redtop.

Other plants within the scope of the invention include Acacia, aneth, artichoke, arugula, blackberry, canola, cilantro, clementines, escarole, eucalyptus, fennel, grapefruit, honey dew, jicama, kiwifruit, lemon, lime, mushroom, nut, okra, orange, parsley, persimmon, plantain, pomegranate, poplar, radiata pine, radicchio, Southern pine, sweetgum, tangerine, triticale, vine, yams, apple, pear, quince, cherry, apricot, melon, hemp, buckwheat, grape, raspberry, chenopodium, blueberry, nectarine, peach, plum, strawberry, watermelon, eggplant, pepper, cauliflower, Brassica, e.g., broccoli, cabbage, brussels sprouts, onion, carrot, leek, beet, broad bean, celery, radish, pumpkin, endive, gourd, garlic, snapbean, spinach, squash, turnip, ultilane, and zucchini.

Ornamental plants within the scope of the invention include Impatiens, Begonia, Pelargonium, Viola, Cyclamen, Verbena, Vinca, Tagetes, Primula, Saintpaulia, Ageratum, Amaranthus, Antirrhinum, Aquilegia, Cineraria, Clover, Cosmos, Cowpea, Dahlia, Datura, Delphinium, Gerbera, Gladiolus, Gloxinia, Hippeastrum, Mesembryanthemum, Salpiglossis, and Zinnia. Other plants within the scope of the invention are shown in Table 1 (above).

Preferably, transgenic plants of the present invention are crop plants and in particular cereals (for example, corn, alfalfa, sunflower, rice, Brassica, canola, soybean, barley, sugarbeet, cotton, safflower, peanut, sorghum, wheat, millet, tobacco, etc.), and even more preferably corn, rice and soybean.

The present invention also provides a transgenic plant prepared by this method, a seed from such a plant and progeny plants from such a plant including hybrids and inbreds. Preferred transgenic plants are transgenic maize, soybean, barley, alfalfa, sunflower, canola, cotton, peanut, sorghum, tobacco, sugarbeet, rice, wheat, rye, turfgrass, millet, sugarcane, tomato, or potato.

A transformed (transgenic) plant of the invention includes plants, the genome of which is augmented by a nucleic acid molecule of the invention, or in which the corresponding gene has been disrupted, e.g., to result in a loss, a decrease or an alteration in the function of the product encoded by the gene, which plant may also have increased yields and/or produce a better-quality product than the corresponding wild-type plant. The nucleic acid molecules of the invention are thus useful for targeted gene disruption, as well as for use as markers and probes.

The invention also provides a method of plant breeding, e.g., to prepare a crossed fertile transgenic plant. The method comprises crossing a fertile transgenic plant comprising a particular nucleic acid molecule of the invention with itself or with a second plant, e.g., one lacking the particular nucleic acid molecule, to prepare the seed of a crossed fertile transgenic plant comprising the particular nucleic acid molecule. The seed is then planted to obtain a crossed fertile transgenic plant. The plant may be a monocot or a dicot. In a particular embodiment, the plant is a cereal plant.

The crossed fertile transgenic plant may have the particular nucleic acid molecule inherited through a female parent or through a male parent. The second plant may be an inbred plant. The crossed fertile transgenic plant may be a hybrid. Also included within the present invention are seeds of any of these crossed fertile transgenic plants.

Transformation of plants can be undertaken with a single DNA molecule or multiple DNA molecules (i.e., co-transformation), and both these techniques are suitable for use with the expression cassettes of the present invention. Numerous transformation vectors are available for plant transformation, and the expression cassettes of this invention can be used in conjunction with any such vectors. The selection of vector will depend upon the preferred transformation technique and the target species for transformation.

A variety of techniques are available and known to those skilled in the art for introduction of constructs into a plant cell host. These techniques generally include transformation with DNA employing A. tumefaciens or A. rhizogenes as the transforming agent, liposomes, PEG precipitation, electroporation, DNA injection, direct DNA uptake, microprojectile bombardment, particle acceleration, and the like (see, for example, EP 295959 and EP 138341) (see below). However, cells other than plant cells may be transformed with the expression cassettes of the invention. The general descriptions of plant expression vectors and reporter genes, and Agrobacterium and Agrobacterium-mediated gene transfer, can be found in Gruber et al. (1993).

Expression vectors containing genomic or synthetic fragments can be introduced into protoplasts or into intact tissues or isolated cells. Preferably expression vectors are introduced into intact tissue. General methods of culturing plant tissues are provided for example by Maki et al. (1993) and by Phillips et al. (1988). Preferably, expression vectors are introduced into maize or other plant tissues using a direct gene transfer method such as microprojectile-mediated delivery, DNA injection, electroporation and the like. More preferably expression vectors are introduced into plant tissues using microprojectile-mediated delivery with the biolistic device. See, for example, Tomes et al. (1995). The vectors of the invention can not only be used for expression of structural genes but may also be used in exon-trap cloning or promoter-trap procedures to detect differential gene expression in varieties of tissues (Lindsey et al., 1993; Auch & Reth et al.).

It is particularly preferred to use the binary type vectors of Ti and Ri plasmids of Agrobacterium spp. Ti-derived vectors transform a wide variety of higher plants, including monocotyledonous and dicotyledonous plants, such as soybean, cotton, rape, tobacco, and rice (Pacciotti et al., 1985; Byrne et al., 1987; Sukhapinda et al., 1987; Lorz et al., 1985; Potrykus, 1985; Park et al., 1985; Hiei et al., 1994). The use of T-DNA to transform plant cells has received extensive study and is amply described (EP 120516; Hoekema, 1985; Knauf, et al., 1983; and An et al., 1985). For introduction into plants, the chimeric genes of the invention can be inserted into binary vectors as described in the examples.

Other transformation methods are available to those skilled in the art, such as direct uptake of foreign DNA constructs (see EP 295959), techniques of electroporation (Fromm et al., 1986) or high velocity ballistic bombardment with metal particles coated with the nucleic acid constructs (Klein et al., 1987, and U.S. Pat. No. 4,945,050). Once transformed, the cells can be regenerated by those skilled in the art. Of particular relevance are the recently described methods to transform foreign genes into commercially important crops, such as rapeseed (De Block et al., 1989), sunflower (Everett et al., 1987), soybean (McCabe et al., 1988; Hinchee et al., 1988; Chee et al., 1989; Christou et al., 1989; EP 301749), rice (Hiei et al., 1994), and corn (Gordon-Kamm et al., 1990; Fromm et al., 1990).

Those skilled in the art will appreciate that the choice of method might depend on the type of plant, i.e., monocotyledonous or dicotyledonous, targeted for transformation. Suitable methods of transforming plant cells include, but are not limited to, microinjection (Crossway et al., 1986), electroporation (Riggs et al., 1986), Agrobacterium-mediated transformation (Hinchee et al., 1988), direct gene transfer (Paszkowski et al., 1984), and ballistic particle acceleration using devices available from Agracetus, Inc., Madison, Wis., and BioRad, Hercules, Calif. (see, for example, Sanford et al., U.S. Pat. No. 4,945,050; and McCabe et al., 1988). Also see Weissinger et al., 1988; Sanford et al., 1987 (onion); Christou et al., 1988 (soybean); McCabe et al., 1988 (soybean); Datta et al., 1990 (rice); Klein et al., 1988 (maize); Fromm et al., 1990 (maize); Gordon-Kamm et al., 1990 (maize); Svab et al., 1990 (tobacco chloroplast); Koziel et al., 1993 (maize); Shimamoto et al., 1989 (rice); Christou et al., 1991 (rice); European Patent Application EP 0 332 581 (orchardgrass and other Pooideae); Vasil et al., 1993 (wheat); Weeks et al., 1993 (wheat). In one embodiment, the protoplast transformation method for maize is employed (European Patent Application EP 0 292 435, U.S. Pat. No. 5,350,689).

In another embodiment, a nucleotide sequence of the present invention is directly transformed into the plastid genome. Plastid transformation technology is extensively described in U.S. Pat. Nos. 5,451,513, 5,545,817, and 5,545,818, in PCT application no. WO 95/16783, and in McBride et al., 1994. The basic technique for chloroplast transformation involves introducing regions of cloned plastid DNA flanking a selectable marker together with the gene of interest into a suitable target tissue, e.g., using biolistics or protoplast transformation (e.g., calcium chloride or PEG mediated transformation). The 1 to 1.5 kb flanking regions, termed targeting sequences, facilitate homologous recombination with the plastid genome and thus allow the replacement or modification of specific regions of the plastome. Initially, point mutations in the chloroplast 16S rRNA and rps12 genes conferring resistance to spectinomycin and/or streptomycin were utilized as selectable markers for transformation (Svab et al., 1990; Staub et al., 1992). This resulted in stable homoplasmic transformants at a frequency of approximately one per 100 bombardments of target leaves. The presence of cloning sites between these markers allowed creation of a plastid targeting vector for introduction of foreign genes (Staub et al., 1993). Substantial increases in transformation frequency are obtained by replacement of the recessive rRNA or r-protein antibiotic resistance genes with a dominant selectable marker, the bacterial aadA gene encoding the spectinomycin-detoxifying enzyme aminoglycoside-3′-adenyltransferase (Svab et al., 1993). Other selectable markers useful for plastid transformation are known in the art and encompassed within the scope of the invention. Typically, approximately 15-20 cell division cycles following transformation are required to reach a homoplastidic state.

Plastid expression, in which genes are inserted by homologous recombination into all of the several thousand copies of the circular plastid genome present in each plant cell, takes advantage of the enormous copy number advantage over nuclear-expressed genes to permit expression levels that can readily exceed 10% of the total soluble plant protein. In a preferred embodiment, a nucleotide sequence of the present invention is inserted into a plastid targeting vector and transformed into the plastid genome of a desired plant host. Plants homoplasmic for plastid genomes containing a nucleotide sequence of the present invention are obtained, and are preferentially capable of high expression of the nucleotide sequence.

Agrobacterium tumefaciens cells containing a vector comprising an expression cassette of the present invention, wherein the vector comprises a Ti plasmid, are useful in methods of making transformed plants. Plant cells are infected with an Agrobacterium tumefaciens as described above to produce a transformed plant cell, and then a plant is regenerated from the transformed plant cell. Numerous Agrobacterium vector systems useful in carrying out the present invention are known.

For example, vectors are available for transformation using Agrobacterium tumefaciens. These typically carry at least one T-DNA border sequence and include vectors such as pBIN19 (Bevan, 1984). In one preferred embodiment, the expression cassettes of the present invention may be inserted into either of the binary vectors pCIB200 and pCIB2001 for use with Agrobacterium. These vector cassettes for Agrobacterium-mediated transformation were constructed in the following manner. pTJS75kan was created by NarI digestion of pTJS75 (Schmidhauser & Helinski, 1985) allowing excision of the tetracycline-resistance gene, followed by insertion of an AccI fragment from pUC4K carrying an NPTII (Messing & Vierra, 1982; Bevan et al., 1983; McBride et al., 1990). XhoI linkers were ligated to the EcoRV fragment of pCIB7 which contains the left and right T-DNA borders, a plant selectable nos/nptII chimeric gene and the pUC polylinker (Rothstein et al., 1987), and the XhoI-digested fragment was cloned into SalI-digested pTJS75kan to create pCIB200 (see also EP 0 332 104, example 19). pCIB200 contains the following unique polylinker restriction sites: EcoRI, SstI, KpnI, BglII, XbaI, and SalI. The plasmid pCIB2001 is a derivative of pCIB200 which was created by the insertion into the polylinker of additional restriction sites. Unique restriction sites in the polylinker of pCIB2001 are EcoRI, SstI, KpnI, BglII, XbaI, SalI, MluI, BclI, AvrII, ApaI, HpaI, and StuI. pCIB2001, in addition to containing these unique restriction sites, also has plant and bacterial kanamycin selection, left and right T-DNA borders for Agrobacterium-mediated transformation, the RK2-derived trfA function for mobilization between E. coli and other hosts, and the OriT and OriV functions also from RK2. The pCIB2001 polylinker is suitable for the cloning of plant expression cassettes containing their own regulatory signals.
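
When choosing polylinker sites for cloning an expression cassette, it helps to know which of the listed enzymes do not cut within the cassette itself. The following Python sketch is a minimal illustration, not part of the described vector construction: the enzyme names are those listed above for pCIB2001, the recognition sequences are the standard ones for these enzymes, and the function name and example insert are hypothetical. Because all twelve recognition sequences are palindromic, searching one strand is sufficient.

```python
# Standard recognition sequences of the unique pCIB2001 polylinker enzymes.
PCIB2001_SITES = {
    "EcoRI": "GAATTC",
    "SstI": "GAGCTC",
    "KpnI": "GGTACC",
    "BglII": "AGATCT",
    "XbaI": "TCTAGA",
    "SalI": "GTCGAC",
    "MluI": "ACGCGT",
    "BclI": "TGATCA",
    "AvrII": "CCTAGG",
    "ApaI": "GGGCCC",
    "HpaI": "GTTAAC",
    "StuI": "AGGCCT",
}


def noncutting_enzymes(insert_seq, sites=PCIB2001_SITES):
    """Return polylinker enzymes with no recognition site inside the insert."""
    seq = insert_seq.upper()
    return [name for name, site in sites.items() if site not in seq]


# Usage with a made-up insert that contains EcoRI and SalI sites.
available = noncutting_enzymes("ATGGAATTCAAAGTCGACTT")
```

In this example EcoRI and SalI are excluded because they would cut inside the insert, leaving the other ten enzymes as candidates. The check ignores practical considerations such as methylation sensitivity and buffer compatibility.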

An additional vector useful for Agrobacterium-mediated transformation is the binary vector pCIB10, which contains a gene encoding kanamycin resistance for selection in plants and T-DNA right and left border sequences, and incorporates sequences from the wide host-range plasmid pRK252 allowing it to replicate in both E. coli and Agrobacterium. Its construction is described by Rothstein et al., 1987. Various derivatives of pCIB10 have been constructed which incorporate the gene for hygromycin B phosphotransferase described by Gritz et al., 1983. These derivatives enable selection of transgenic plant cells on hygromycin only (pCIB743), or hygromycin and kanamycin (pCIB715, pCIB717).

Methods using either a form of direct gene transfer or Agrobacterium-mediated transfer usually, but not necessarily, are undertaken with a selectable marker which may provide resistance to an antibiotic (e.g., kanamycin, hygromycin or methotrexate) or a herbicide (e.g., phosphinothricin). The choice of selectable marker for plant transformation is not, however, critical to the invention.

For certain plant species, different antibiotic or herbicide selection markers may be preferred. Selection markers used routinely in transformation include the nptII gene which confers resistance to kanamycin and related antibiotics (Messing & Vierra, 1982; Bevan et al., 1983), the bar gene which confers resistance to the herbicide phosphinothricin (White et al., 1990, Spencer et al., 1990), the hph gene which confers resistance to the antibiotic hygromycin (Blochinger & Diggelmann), and the dhfr gene, which confers resistance to methotrexate (Bourouis et al., 1983).

Selection markers resulting in positive selection, such as a phosphomannose isomerase gene, as described in patent application WO 93/05163, are also used. Other genes to be used for positive selection are described in WO 94/20627 and encode xyloisomerases and phosphomanno-isomerases such as mannose-6-phosphate isomerase and mannose-1-phosphate isomerase; phosphomanno mutase; mannose epimerases such as those which convert carbohydrates to mannose or mannose to carbohydrates such as glucose or galactose; phosphatases such as mannose or xylose phosphatase, mannose-6-phosphatase and mannose-1-phosphatase; and permeases which are involved in the transport of mannose, or a derivative, or a precursor thereof into the cell. The agent which reduces the toxicity of the compound to the cells is typically a glucose derivative such as methyl-3-O-glucose or phloridzin. Transformed cells are identified without damaging or killing the non-transformed cells in the population and without co-introduction of antibiotic or herbicide resistance genes. As described in WO 93/05163, in addition to the fact that the need for antibiotic or herbicide resistance genes is eliminated, it has been shown that the positive selection method is often far more efficient than traditional negative selection.

One vector useful for direct gene transfer techniques in combination with selection by the herbicide Basta (or phosphinothricin) is pCIB3064. This vector is based on the plasmid pCIB246, which comprises the CaMV 35S promoter in operational fusion to the E. coli GUS gene and the CaMV 35S transcriptional terminator and is described in the PCT published application WO 93/07278, herein incorporated by reference. One gene useful for conferring resistance to phosphinothricin is the bar gene from Streptomyces viridochromogenes (Thompson et al., 1987). This vector is suitable for the cloning of plant expression cassettes containing their own regulatory signals.

An additional transformation vector is pSOG35 which utilizes the E. coli gene dihydrofolate reductase (DHFR) as a selectable marker conferring resistance to methotrexate. PCR was used to amplify the 35S promoter (about 800 bp), intron 6 from the maize Adh1 gene (about 550 bp) and 18 bp of the GUS untranslated leader sequence from pSOG10. A 250 bp fragment encoding the E. coli dihydrofolate reductase type II gene was also amplified by PCR and these two PCR fragments were assembled with a SacI-PstI fragment from pBI221 (Clontech) which comprised the pUC19 vector backbone and the nopaline synthase terminator. Assembly of these fragments generated pSOG19 which contains the 35S promoter in fusion with the intron 6 sequence, the GUS leader, the DHFR gene and the nopaline synthase terminator. Replacement of the GUS leader in pSOG19 with the leader sequence from Maize Chlorotic Mottle Virus (MCMV) generated the vector pSOG35. pSOG19 and pSOG35 carry the pUC-derived gene for ampicillin resistance and have HindIII, SphI, PstI and EcoRI sites available for the cloning of foreign sequences.

Binary backbone vector pNOV2117 contains the T-DNA portion flanked by the right and left border sequences, and including the Positech™ (Syngenta) plant selectable marker and the “grain filling candidate gene” gene expression cassette. The Positech™ plant selectable marker confers resistance to mannose and in this instance consists of the maize ubiquitin promoter driving expression of the PMI (phosphomannose isomerase) gene, followed by the cauliflower mosaic virus transcriptional terminator.

Transgenic plant cells are then placed in an appropriate selectivemedium for selection of transgenic cells which are then grown to callus.Shoots are grown from callus and plantlets generated from the shoot bygrowing in rooting medium. The various constructs normally will bejoined to a marker for selection in plant cells. Conveniently, themarker may be resistance to a biocide (particularly an antibiotic, suchas kanamycin, G418, bleomycin, hygromycin, chloramphenicol, herbicide,or the like). The particular marker used will allow for selection oftransformed cells as compared to cells lacking the DNA which has beenintroduced. Components of DNA constructs including transcriptioncassettes of this invention may be prepared from sequences which arenative (endogenous) or foreign (exogenous) to the host. By “foreign” itis meant that the sequence is not found in the wild-type host into whichthe construct is introduced. Heterologous constructs will contain atleast one region which is not native to the gene from which thetranscription-initiation-region is derived.

To confirm the presence of the transgenes in transgenic cells andplants, a variety of assays may be performed. Such assays include, forexample, “molecular biological” assays well known to those of skill inthe art, such as Southern and Northern blotting, in situ hybridizationand nucleic acid-based amplification methods such as PCR or RT-PCR;“biochemical” assays, such as detecting the presence of a proteinproduct, e.g., by immunological means (ELISAs and Western blots) or byenzymatic function; plant part assays, such as seed assays; and also, byanalyzing the phenotype of the whole regenerated plant, e.g., fordisease or pest resistance.

DNA may be isolated from cell lines or any plant parts to determine thepresence of the preselected nucleic acid segment through the use oftechniques well known to those skilled in the art. Note that intactsequences will not always be present, presumably due to rearrangement ordeletion of sequences in the cell.

The presence of nucleic acid elements introduced through the methods of this invention may be determined by polymerase chain reaction (PCR). Using this technique, discrete fragments of nucleic acid are amplified and detected by gel electrophoresis. This type of analysis permits one to determine whether a preselected nucleic acid segment is present in a stable transformant, but does not prove integration of the introduced preselected nucleic acid segment into the host cell genome. In addition, it is not possible using PCR techniques to determine whether transformants have exogenous genes introduced into different sites in the genome, i.e., whether transformants are of independent origin. It is contemplated that using PCR techniques it would be possible to clone fragments of the host genomic DNA adjacent to an introduced preselected DNA segment.

Positive proof of DNA integration into the host genome and theindependent identities of transformants may be determined using thetechnique of Southern hybridization. Using this technique specific DNAsequences that were introduced into the host genome and flanking hostDNA sequences can be identified. Hence the Southern hybridizationpattern of a given transformant serves as an identifying characteristicof that transformant. In addition it is possible through Southernhybridization to demonstrate the presence of introduced preselected DNAsegments in high molecular weight DNA, i.e., confirm that the introducedpreselected DNA segment has been integrated into the host cell genome.The technique of Southern hybridization provides information that isobtained using PCR, e.g., the presence of a preselected DNA segment, butalso demonstrates integration into the genome and characterizes eachindividual transformant.

It is contemplated that using the techniques of dot or slot blothybridization which are modifications of Southern hybridizationtechniques one could obtain the same information that is derived fromPCR, e.g., the presence of a preselected DNA segment.

Both PCR and Southern hybridization techniques can be used to demonstrate transmission of a preselected DNA segment to progeny. In most instances the characteristic Southern hybridization pattern for a given transformant will segregate in progeny as one or more Mendelian genes (Spencer et al., 1992; Laursen et al., 1994), indicating stable inheritance of the gene. The nonchimeric nature of the callus and the parental transformants (R₀) was suggested by germline transmission and the identical Southern blot hybridization patterns and intensities of the transforming DNA in callus, R₀ plants and R₁ progeny that segregated for the transformed gene.
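As a minimal sketch of how segregation "as a Mendelian gene" might be checked numerically, the following assumes a single dominant transgene locus in selfed progeny, so that a 3:1 ratio of PCR- or Southern-positive to negative plants is expected. The 3:1 expectation, the function name and the example counts are illustrative assumptions and are not taken from the cited studies.

```python
def chi_square_3_to_1(positive, negative):
    """Chi-square statistic for observed counts against a 3:1 expectation.

    positive / negative: numbers of progeny that do / do not carry the
    preselected DNA segment (hypothetical example counts below).
    """
    total = positive + negative
    if total == 0:
        raise ValueError("no progeny scored")
    expected_pos = 0.75 * total
    expected_neg = 0.25 * total
    return ((positive - expected_pos) ** 2 / expected_pos
            + (negative - expected_neg) ** 2 / expected_neg)


# Critical value of the chi-square distribution, 1 degree of freedom, p = 0.05.
CRITICAL_5_PERCENT_DF1 = 3.841

statistic = chi_square_3_to_1(78, 22)
consistent_with_single_locus = statistic < CRITICAL_5_PERCENT_DF1
```

A statistic below the critical value means the counts do not contradict single-locus inheritance; it does not by itself prove it, and multi-locus insertions would call for other expected ratios.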

Whereas DNA analysis techniques may be conducted using DNA isolated fromany part of a plant, RNA may only be expressed in particular cells ortissue types and hence it will be necessary to prepare RNA for analysisfrom these tissues. PCR techniques may also be used for detection andquantitation of RNA produced from introduced preselected DNA segments.In this application of PCR it is first necessary to reverse transcribeRNA into DNA, using enzymes such as reverse transcriptase, and thenthrough the use of conventional PCR techniques amplify the DNA. In mostinstances PCR techniques, while useful, will not demonstrate integrityof the RNA product. Further information about the nature of the RNAproduct may be obtained by Northern blotting. This technique willdemonstrate the presence of an RNA species and give information aboutthe integrity of that RNA. The presence or absence of an RNA species canalso be determined using dot or slot blot Northern hybridizations. Thesetechniques are modifications of Northern blotting and will onlydemonstrate the presence or absence of an RNA species.

While Southern blotting and PCR may be used to detect the preselectedDNA segment in question, they do not provide information as to whetherthe preselected DNA segment is being expressed. Expression may beevaluated by specifically identifying the protein products of theintroduced preselected DNA segments or evaluating the phenotypic changesbrought about by their expression.

Assays for the production and identification of specific proteins may make use of physicochemical, structural, functional, or other properties of the proteins. Unique physicochemical or structural properties allow the proteins to be separated and identified by electrophoretic procedures, such as native or denaturing gel electrophoresis or isoelectric focusing, or by chromatographic techniques such as ion exchange or gel exclusion chromatography. The unique structures of individual proteins offer opportunities for use of specific antibodies to detect their presence in formats such as an ELISA assay. Combinations of approaches may be employed with even greater specificity, such as Western blotting, in which antibodies are used to locate individual gene products that have been separated by electrophoretic techniques. Additional techniques may be employed to absolutely confirm the identity of the product of interest, such as evaluation by amino acid sequencing following purification. Although these are among the most commonly employed, other procedures may be additionally used.

Assay procedures may also be used to identify the expression of proteinsby their functionality, especially the ability of enzymes to catalyzespecific chemical reactions involving specific substrates and products.These reactions may be followed by providing and quantifying the loss ofsubstrates or the generation of products of the reactions by physical orchemical procedures. Examples are as varied as the enzyme to beanalyzed.

Very frequently the expression of a gene product is determined byevaluating the phenotypic results of its expression. These assays alsomay take many forms including but not limited to analyzing changes inthe chemical composition, morphology, or physiological properties of theplant. Morphological changes may include greater stature or thickerstalks. Most often changes in response of plants or plant parts toimposed treatments are evaluated under carefully controlled conditionstermed bioassays.

The compositions of the invention include plant nucleic acid molecules,and the amino acid sequences for the polypeptides or partial-lengthpolypeptides encoded by the nucleic acid molecule which comprises anopen reading frame. These sequences can be employed to alter expressionof a particular gene corresponding to the open reading frame bydecreasing or eliminating expression of that plant gene or byoverexpressing a particular gene product. Methods of this embodiment ofthe invention include stably transforming a plant with the nucleic acidmolecule of the invention which includes an open reading frame operablylinked to a promoter capable of driving expression of that open readingframe (sense or antisense) in a plant cell. By “portion” or “fragment”,as it relates to a nucleic acid molecule which comprises an open readingframe or a fragment thereof encoding a partial-length polypeptide havingthe activity of the full length polypeptide, is meant a sequence havingat least 80 nucleotides, more preferably at least 150 nucleotides, andstill more preferably at least 400 nucleotides. If not employed forexpressing, a “portion” or “fragment” means at least 9, preferably 12,more preferably 15, even more preferably at least 20, consecutivenucleotides, e.g., probes and primers (oligonucleotides), correspondingto the nucleotide sequence of the nucleic acid molecules of theinvention. Thus, to express a particular gene product, the methodcomprises introducing to a plant, plant cell, or plant tissue anexpression cassette comprising a promoter linked to an open readingframe so as to yield a transformed differentiated plant, transformedcell or transformed tissue. Transformed cells or tissue can beregenerated to provide a transformed differentiated plant. Thetransformed differentiated plant or cells thereof preferably expressesthe open reading frame in an amount that alters the amount of the geneproduct in the plant or cells thereof, which product is encoded by theopen reading frame. 
The present invention also provides a transformed plant prepared by the method, progeny and seed thereof.

The invention further includes a nucleotide sequence which is complementary to one (hereinafter “test” sequence) which hybridizes under stringent conditions with a nucleic acid molecule of the invention as well as RNA which is transcribed from the nucleic acid molecule. When the hybridization is performed under stringent conditions, either the test or nucleic acid molecule of the invention is preferably supported, e.g., on a membrane or DNA chip. Thus, either a denatured test or nucleic acid molecule of the invention is preferably first bound to a support and hybridization is effected for a specified period of time at a temperature of, e.g., between 55 and 70° C., in double strength citrate buffered saline (SC) containing 0.1% SDS followed by rinsing of the support at the same temperature but with a buffer having a reduced SC concentration. Depending upon the degree of stringency required, such reduced concentration buffers are typically single strength SC containing 0.1% SDS, half strength SC containing 0.1% SDS and one-tenth strength SC containing 0.1% SDS.

In a further embodiment, the present invention provides a transformedplant host cell, or one obtained through breeding, capable ofover-expressing, under-expressing, or having a knock out of amino acidgenes and/or their gene products. The plant cell is transformed with atleast one such expression vector wherein the plant host cell can be usedto regenerate plant tissue or an entire plant, or seed there from, inwhich the effects of expression, including overexpression orunderexpression, of the introduced sequence or sequences can be measuredin vitro or in planta.

Polynucleotides derived from the nucleic acid molecules of the presentinvention having any of the nucleotide sequences of SEQ ID NO: 1 to 461and 501 to 511, respectively, encoding a polypeptide the expression ofwhich is up-regulated during grain filling, are useful to detect thepresence in a test sample of at least one copy of a nucleotide sequencecontaining the same or substantially the same sequence, or a fragment,complement, or variant thereof. The sequence of the probes and/orprimers of the instant invention need not be identical to those providedin the Sequence Listing or the complements thereof. Some variation inprobe or primer sequence and/or length can allow additional familymembers to be detected, as well as orthologous genes and moretaxonomically distant related sequences. Similarly probes and/or primersof the invention can include additional nucleotides that serve as alabel for detecting duplexes, for isolation of duplexed polynucleotides,or for cloning purposes.

Preferred probes and primers of the invention include isolated, purified, or recombinant polynucleotides containing a contiguous span of between at least 12 to at least 1000 nucleotides of any nucleotide sequence which is substantially similar, and preferably has at least between 70% and 99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and 513-641, respectively, encoding a polypeptide the expression of which is up-regulated during grain filling, or the complements thereof, with each individual number of nucleotides within this range also being part of the invention. Preferred are isolated, purified, or recombinant polynucleotides containing a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300, 400, 500, 750, or 1000 nucleotides of any nucleotide sequence which is substantially similar, and preferably has at least between 70% and 99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and 513-641, respectively, encoding a polypeptide the expression of which is up-regulated during grain filling, or the complements thereof. The appropriate length for primers and probes will vary depending on the application. For use as PCR primers, probes are 12-40 nucleotides, preferably 18-30 nucleotides long. For use in mapping, probes are 50 to 500 nucleotides, preferably 100-250 nucleotides long. For use in Southern hybridizations, probes as long as several kilobases can be used. The appropriate length for primers and probes under a particular set of assay conditions may be empirically determined by one of skill in the art.
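The length and identity thresholds above can be illustrated with a short sketch. It assumes a simple ungapped, position-by-position comparison between a candidate probe and an equally long span of a target sequence; this is a simplification for illustration and not necessarily the way sequence identity is computed elsewhere in this specification. The function names and the example sequences are hypothetical.

```python
def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def is_candidate_probe(probe, target_span, min_length=12, min_identity=70.0):
    """True if the probe meets the minimum span length and identity."""
    return (len(probe) >= min_length
            and percent_identity(probe, target_span) >= min_identity)


# Hypothetical 12-nucleotide probe with one mismatch against the target span.
probe = "ACGTACGTACGA"
target_span = "ACGTACGTACGT"
identity = percent_identity(probe, target_span)   # 11 of 12 positions match
accepted = is_candidate_probe(probe, target_span)
```

The defaults mirror the lower bounds named in the text (12 nucleotides, 70% identity); application-specific ranges such as 18-30 nucleotides for PCR primers could be passed in instead.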

The primers and probes can be prepared by any suitable method,including, for example, cloning and restriction of appropriate sequencesand direct chemical synthesis by a method such as the phosphodiestermethod of Narang et al. (Meth Enzymol 68: 90 (1979)), thediethylphosphoramidite method, the triester method of Matteucci et al.(J Am Chem Soc 103: 3185 (1981)), or according to Urdea et al. (ProcNatl Acad 80: 7461 (1981)), the solid support method described in EP 0707 592, or using commercially available automated oligonucleotidesynthesizers.

Detection probes are generally nucleotide sequences or unchargednucleotide analogs such as, for example peptide nucleotides which aredisclosed in International Patent Application WO 92/20702, morpholinoanalogs which are described in U.S. Pat. Nos. 5,185,444, 5,034,506 and5,142,047. The probe may have to be rendered “non-extendable” such thatadditional dNTPs cannot be added to the probe. Analogs are usuallynonextendable, and nucleotide probes can be rendered non-extendable bymodifying the 3′ end of the probe such that the hydroxyl group is nolonger capable of participating in elongation. For example, the 3′ endof the probe can be functionalized with the capture or detection labelto thereby consume or otherwise block the hydroxyl group. Alternatively,the 3′ hydroxyl group simply can be cleaved, replaced or modified so asto render the probe non-extendable.

Any of the polynucleotides of the present invention can be labeled, if desired, by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include radioactive substances (³²P, ³⁵S, ³H, ¹²⁵I), fluorescent dyes (5-bromodesoxyuridine, fluorescein, acetylaminofluorene, digoxigenin) or biotin. Preferably, polynucleotides are labeled at their 3′ and 5′ ends. Examples of non-radioactive labeling of nucleotide fragments are described in the French patent No. FR-7810975 and by Urdea et al. (Nuc Acids Res 16:4937 (1988)). In addition, the probes according to the present invention may have structural characteristics that allow signal amplification, such structural characteristics being, for example, branched DNA probes as described in EP 0 225 807.

A label can also be used to capture the primer so as to facilitate the immobilization of either the primer or a primer extension product, such as amplified DNA, on a solid support. A capture label is attached to the primers or probes and can be a specific binding member that forms a binding pair with the solid phase reagent's specific binding member, for example biotin and streptavidin. Therefore, depending upon the type of label carried by a polynucleotide or a probe, it may be employed to capture or to detect the target DNA. Further, it will be understood that the polynucleotides, primers or probes provided herein may, themselves, serve as the capture label. For example, in the case where a solid phase reagent's binding member is a nucleotide sequence, it may be selected such that it binds a complementary portion of a primer or probe to thereby immobilize the primer or probe to the solid phase. In cases where a polynucleotide probe itself serves as the binding member, those skilled in the art will recognize that the probe will contain a sequence or “tail” that is not complementary to the target. In the case where a polynucleotide primer itself serves as the capture label, at least a portion of the primer will be free to hybridize with a nucleotide on a solid phase. DNA labeling techniques are well known in the art.

Any of the polynucleotides, primers and probes of the present inventioncan be conveniently immobilized on a solid support. Solid supports areknown to those skilled in the art and include the walls of wells of areaction tray, test tubes, polystyrene beads, magnetic beads,nitrocellulose strips, membranes, microparticles such as latexparticles, sheep (or other animal) red blood cells, duracytes andothers. The solid support is not critical and can be selected by oneskilled in the art. Thus, latex particles, microparticles, magnetic ornon-magnetic beads, membranes, plastic tubes, walls of microtiter wells,glass or silicon chips, sheep (or other suitable animal's) red bloodcells and duracytes are all suitable examples. Suitable methods forimmobilizing nucleotides on solid phases include ionic, hydrophobic,covalent interactions and the like. A solid support, as used herein,refers to any material that is insoluble, or can be made insoluble by asubsequent reaction. The solid support can be chosen for its intrinsicability to attract and immobilize the capture reagent. Alternatively,the solid phase can retain an additional receptor that has the abilityto attract and immobilize the capture reagent. The additional receptorcan include a charged substance that is oppositely charged with respectto the capture reagent itself or to a charged substance conjugated tothe capture reagent. As yet another alternative, the receptor moleculecan be any specific binding member which is immobilized upon (attachedto) the solid support and which has the ability to immobilize thecapture reagent through a specific binding reaction. The receptormolecule enables the indirect binding of the capture reagent to a solidsupport material before the performance of the assay or during theperformance of the assay. 
The solid phase thus can be a plastic,derivatized plastic, magnetic or non-magnetic metal, glass or siliconsurface of a test tube, microtiter well, sheet, bead, microparticle,chip, sheep (or other suitable animal's) red blood cells, duracytes andother configurations known to those of ordinary skill in the art. Thepolynucleotides of the invention can be attached to or immobilized on asolid support individually or in groups of at least 2, 5, 8, 10, 12, 15,20, or 25 distinct polynucleotides of the invention to a single solidsupport. In addition, polynucleotides other than those of the inventionmay be attached to the same solid support as one or more polynucleotidesof the invention.

The polynucleotides of the invention that are expressed or repressed in response to environmental stimuli such as, for example, biotic or abiotic stress or treatment with chemicals or pathogens or at different developmental stages can be identified by employing an array of nucleic acid samples, e.g., each sample having a plurality of oligonucleotides, and each plurality corresponding to a different plant gene, on a solid substrate, e.g., a DNA chip, and probes corresponding to nucleic acid expressed in, for example, one or more plant tissues and/or at one or more developmental stages, e.g., probes corresponding to nucleic acid expressed in seed of a plant relative to control nucleic acid from sources other than seed. Thus, genes that are upregulated or downregulated in the majority of tissues at a majority of developmental stages, or upregulated or downregulated in one tissue such as in seed, can be systematically identified. The probes may also correspond to nucleic acid expressed in response to a defined treatment such as, for example, a treatment with a variety of plant hormones or the exposure to specific environmental conditions involving, for example, an abiotic stress or exposure to light.

Specifically, labeled rice cRNA probes were hybridized to the rice DNAarray, expression levels were determined by laser scanning and then ricegenes were identified that had a particular expression pattern. The riceoligonucleotide probe array consists of probes from over 18,000 uniquerice genes, which covers approximately 40-50% of the genome. This genomearray permits a broader, more complete and less biased analysis of geneexpression.

As described herein, GeneChip® technology was utilized to discover ricegenes that are preferentially (or exclusively) expressed during thegrain filling process in specific tissues of the plant grain such as,for example, the aleurone, embryo, endosperm, seed coat, etc.

Using this approach, 461 genes were identified, the expression of which was specifically elevated during the grain filling process.

Consequently, the invention also deals with a method for detecting the presence of a polynucleotide including a nucleotide sequence which is substantially similar, and preferably has at least between 70% and 99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and 513-641, respectively, encoding a polypeptide the expression of which is up-regulated during grain filling, or a fragment or a variant thereof, or a complementary sequence thereto in a sample, the method including the following steps of:

-   (a) bringing into contact a nucleotide probe or a plurality of nucleotide probes which can hybridize with a polynucleotide having a nucleotide sequence which is substantially similar, and preferably has at least between 70% and 99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and 513-641, respectively, encoding a polypeptide the expression of which is up-regulated during grain filling, or a fragment or a variant thereof, or a complementary sequence thereto and the sample to be assayed; and
-   (b) detecting the hybrid complex formed between the probe and a nucleotide in the sample.

The invention further concerns a kit for detecting the presence of a polynucleotide including a nucleotide sequence which is substantially similar, and preferably has at least between 70% and 99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and 513-641, respectively, encoding a polypeptide the expression of which is up-regulated during grain filling, or a fragment or a variant thereof, or a complementary sequence thereto in a sample, the kit including a nucleotide probe or a plurality of nucleotide probes which can hybridize with a nucleotide sequence included in a polynucleotide including a nucleotide sequence which is substantially similar, and preferably has at least between 70% and 99% sequence identity to any one of SEQ ID NO: 1 to 461, 501-511, and 513-641, respectively, encoding a polypeptide the expression of which is up-regulated during grain filling, or a fragment or a variant thereof, or a complementary sequence thereto and, optionally, the reagents necessary for performing the hybridization reaction.

In a first preferred embodiment of this detection method and kit, thenucleotide probe or the plurality of nucleotide probes are labeled witha detectable molecule. In a second preferred embodiment of the methodand kit, the nucleotide probe or the plurality of nucleotide probes hasbeen immobilized on a substrate.

The isolated polynucleotides of the invention can be used to createvarious types of genetic and physical maps of the genome of rice orother plants. Such maps are used to devise positional cloning strategiesfor isolating novel genes from the mapped crop species. The sequences ofthe present invention are also useful for chromosome mapping, chromosomeidentification, tagging of genes that are involved in the grain fillingprocess.

The isolated polynucleotides of the invention can further be used asprobes for identifying polymorphisms associated with phenotypes ofinterest such as, for example, enhanced phosphate utilization, andhigher yield. Briefly, total DNA is isolated from an individual orisogenic line, cleaved with one or more restriction enzymes, separatedaccording to mass, transferred to a solid support, and hybridized with aprobe molecule according to the invention. The pattern of fragmentshybridizing to a probe molecule is compared for DNA from differentindividuals or lines, where differences in fragment size signals apolymorphism associated with a particular nucleotide sequence accordingto the present invention. After identification of polymorphic sequences,linkage studies can be conducted. After identification of manypolymorphisms using a nucleotide sequence according to the invention,linkage studies can be conducted by using the individuals showingpolymorphisms as parents in crossing programs. Recombinants, F₂ progenyrecombinants or recombinant inbreds, can then be analyzed using the samerestriction enzyme/hybridization procedure. The order of DNApolymorphisms along the chromosomes can be inferred based on thefrequency with which they are inherited together versus inheritedindependently. The closer together two polymorphisms occur in achromosome, the higher the probability that they are inherited together.Integration of the relative positions of polymorphisms and associatedmarker sequences produces a genetic map of the species, where thedistances between markers reflect the recombination frequencies in thatchromosome segment. Preferably, the polymorphisms and marker sequencesare sufficiently numerous to produce a genetic map of sufficiently highresolution to locate one or more loci of interest.
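The idea that marker order can be inferred from how often polymorphisms are inherited together can be sketched as follows. The sketch assumes each marker has been scored across the same set of recombinant lines as a string of parental allele calls ('A' or 'B', with '-' for a missing call); the marker names and scores are hypothetical. It reports only the raw fraction of lines in which two markers carry different parental alleles, which is a proxy for recombination frequency and is not converted here into a map distance.

```python
from itertools import combinations


def recombinant_fraction(scores_1, scores_2):
    """Fraction of lines whose parental allele differs between two markers."""
    pairs = [(a, b) for a, b in zip(scores_1, scores_2)
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no lines scored for both markers")
    return sum(a != b for a, b in pairs) / len(pairs)


# Hypothetical scores for three markers across ten recombinant lines.
markers = {
    "m1": "AABBABABBA",
    "m2": "AABBABABAB",
    "m3": "ABBBAAABAB",
}

pairwise = {
    (x, y): recombinant_fraction(markers[x], markers[y])
    for x, y in combinations(sorted(markers), 2)
}
# The pair with the largest fraction is taken as the two outermost markers.
outer_pair = max(pairwise, key=pairwise.get)
```

In this toy data set m1 and m3 differ most often, which places m2 between them; real mapping software additionally handles missing data, scoring errors and the conversion of fractions to map units.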

The use of recombinant inbred lines for such genetic mapping is described for rice (Oh et al., Mol Cells 8:175 (1998); Nandi et al., Mol Gen Genet 255:1 (1997); Wang et al., Genetics 136:1421 (1994)), sorghum (Subudhi et al., Genome 43:240 (2000)), maize (Burr et al., Genetics 118:519 (1998); Gardiner et al., Genetics 134:917 (1993)), and Arabidopsis (Methods in Molecular Biology, Martinez-Zapater and Salinas, eds., 82:137-146, (1998)). However, this procedure is not limited to plants and can be used for other organisms such as yeast or other fungi, or for oomycetes or other protistans.

The nucleotide sequences of the present invention can also be used for simple sequence repeat identification, also known as single sequence repeat (SSR) mapping. SSR mapping in rice has been described by Miyao et al. (DNA Res 3:233 (1996)) and Yang et al. (Mol Gen Genet 245:187 (1994)), and in maize by Ahn et al. (Mol Gen Genet 241:483 (1993)). SSR mapping can be achieved using various methods. In one instance, polymorphisms are identified when sequence specific probes flanking an SSR contained within a sequence of the invention are made and used in polymerase chain reaction (PCR) assays with template DNA from two or more individuals or, in plants, near isogenic lines. A change in the number of tandem repeats between the SSR-flanking sequences produces differently sized fragments (U.S. Pat. No. 5,766,847). Alternatively, polymorphisms can be identified by using the PCR fragment produced from the SSR-flanking sequence specific primer reaction as a probe against Southern blots representing different individuals (Refseth et al., Electrophoresis 18:1519 (1997)). Rice SSRs were used to map a molecular marker closely linked to a nuclear restorer gene for fertility in rice as described by Akagi et al. (Genome 39:205 (1996)).
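A first step in SSR mapping, locating tandem repeats within a sequence so that flanking primers can be designed, can be sketched with a regular expression. The repeat-unit sizes (2 to 4 nucleotides), the minimum of four tandem copies, the function name and the example sequence are assumptions chosen for illustration and are not parameters stated in this specification.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=4, min_repeats=4):
    """Return (start, unit, copies) for each tandem repeat in the sequence.

    start is the 0-based position of the repeat; homopolymer runs such as
    AAAAAAAA are skipped.
    """
    sequence = sequence.upper()
    # A unit of min_unit..max_unit bases (shortest tried first), followed by
    # at least (min_repeats - 1) further copies of that same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical sequence containing six copies of the dinucleotide GA.
example = "TTCC" + "GA" * 6 + "CTTA"
repeats = find_ssrs(example)
```

Because the amplified fragment spans the repeat, two lines that differ by n copies of the unit would be expected to yield PCR products differing by n times the unit length, which is the size difference detected on a gel.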

The nucleotide sequences of the present invention can be used toidentify and develop a variety of microsatellite markers, including theSSRs described above, as genetic markers for comparative analysis andmapping of genomes. The nucleotide sequences of the present inventioncan be used in a variation of the SSR technique known as inter-SSR(ISSR), which uses microsatellite oligonucleotides as primers to amplifygenomic segments different from the repeat region itself (Zietkiewicz etal., Genomics 20:176 (1994)). ISSR employs oligonucleotides based on asimple sequence repeat anchored or not at their 5′- or 3′-end by two tofour arbitrarily chosen nucleotides, which triggers site-specificannealing and initiates PCR amplification of genomic segments which areflanked by inversely orientated and closely spaced repeat sequences. Inone embodiment of the present invention, microsatellite markers derivedfrom the nucleotide sequences disclosed in the Sequence Listing, orsubstantially similar sequences or allelic variants thereof, may be usedto detect the appearance or disappearance of markers indicating genomicinstability as described by Leroy et al. (Electron. J. Biotechnol, 3(2),at http://www.ejb.org (2000)), where alteration of a fingerprintingpattern indicated loss of a marker corresponding to a part of a geneinvolved in the regulation of cell proliferation. Microsatellite markersderived from nucleotide sequences as provided in the Sequence Listingwill be useful for detecting genomic alterations such as the changeobserved by Leroy et al. (Electron. J Biotechnol, 3(2), supra (2000))which appeared to be the consequence of microsatellite instability atthe primer binding site or modification of the region between themicrosatellites, and illustrated somaclonal variation leading to genomicinstability. 
Consequently, the nucleotide sequences of the presentinvention are useful for detecting genomic alterations involved insomaclonal variation, which is an important source of new phenotypes.

In addition, because the genomes of closely related species are largelysyntenic (that is, they display the same ordering of genes within thegenome), these maps can be used to isolate novel alleles from wildrelatives of crop species by positional cloning strategies. This sharedsynteny is very powerful for using genetic maps from one species to mapgenes in another. For example, a gene mapped in rice providesinformation for the gene location in maize and wheat.

The various types of maps discussed above can be used with the nucleotide sequences of the invention to identify Quantitative Trait Loci (QTLs) for a variety of uses, including marker-assisted breeding. Many important crop traits are quantitative traits and result from the combined interactions of several genes. These genes reside at different loci in the genome, often on different chromosomes, and generally exhibit multiple alleles at each locus. Developing markers, tools, and methods to identify and isolate the QTLs involved in regulating the content and composition of the plant grain enables marker-assisted breeding to enhance the nutritional value of the grain or suppress undesirable traits that interfere with an efficient grain filling process. The nucleotide sequences as provided in the Sequence Listing can be used to generate markers, including simple sequence repeats (SSRs) and microsatellite markers, for QTLs and for use in marker-assisted breeding. The nucleotide sequences of the invention can be used to identify QTLs regulating the grain filling process and isolate alleles as described by Li et al. in a study of QTLs involved in resistance to a pathogen of rice (Li et al., Mol Gen Genet 261:58 (1999)). In addition to isolating QTL alleles in rice, other cereals, and other monocot and dicot crop species, the nucleotide sequences of the invention can also be used to isolate alleles from the corresponding QTL(s) of wild relatives. Transgenic plants having various combinations of QTL alleles can then be created and the effects of the combinations measured. Once an ideal allele combination has been identified, crop improvement can be accomplished either through biotechnological means or by directed conventional breeding programs (Flowers et al., J Exp Bot 51:99 (2000); Tanksley and McCouch, Science 277:1063 (1997)).
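As an illustration of how SSR candidates might be located in a nucleotide sequence before being developed into markers, the following Python sketch scans for tandem tri- and tetra-nucleotide repeats and reports a 1-based start, an end, and the repeat unit. The function name `find_ssrs`, the minimum of five repeats, and the example sequence are assumptions made for this sketch; the specification does not state the criteria actually used.

```python
import re

def find_ssrs(seq, unit_sizes=(3, 4), min_repeats=5):
    """Return (start, end, unit) tuples for perfect tandem repeats.

    Positions are 1-based and inclusive. The thresholds are illustrative
    assumptions, not values taken from the specification.
    """
    seq = seq.upper()
    hits = []
    for k in unit_sizes:
        # One k-base unit followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if len(set(unit)) == 1:
                continue  # skip homopolymer runs such as AAAAAA...
            hits.append((match.start() + 1, match.end(), unit))
    return sorted(hits)

if __name__ == "__main__":
    example = "ATG" + "CCT" * 6 + "GATTACA"  # hypothetical sequence
    for start, end, unit in find_ssrs(example):
        print(start, end, unit)
```

Only perfect repeats are detected; interrupted or compound repeats would need a more tolerant search.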

In another embodiment, the nucleotide sequences of the invention can be used to help create physical maps of the genome of maize, Arabidopsis and related species. Where the nucleotide sequences of the invention have been ordered on a genetic map, as described above, then the nucleotide sequences of the invention can be used as probes to discover which clones in large libraries of plant DNA fragments in YACs, PACs, etc. contain the same nucleotide sequences of the invention or similar sequences, thereby facilitating the assignment of the large DNA fragments to chromosomal positions. Subsequently, the large BACs, YACs, etc. can be ordered unambiguously by more detailed studies of their sequence composition and by using their end or other sequence to find the identical sequences in other cloned DNA fragments (Mozo et al., Nat Genet 22:271 (1999)). Overlapping DNA sequences in this way allows assembly of large sequence contigs that, when sufficiently extended, provide a complete physical map of a chromosome. The nucleotide sequences of the invention themselves may provide the means of joining cloned sequences into a contig, and are useful for constructing physical maps.

In another embodiment, the nucleotide sequences of the present invention may be useful in mapping and characterizing the genomes of other cereals. Rice has been proposed as a model for cereal genome analysis (Havukkala, Curr Opin Genet Devel 6:711 (1996)), based largely on its smaller genome size and higher gene density, combined with the considerable conserved gene order among cereal genomes (Ahn et al., Mol Gen Genet 241:483 (1993)). The cereals demonstrate both general conservation of gene order (synteny) and considerable sequence homology among various cereal gene families. This suggests that studies on the functions of genes or proteins from rice according to the present invention could lead to elucidation of the functions of orthologous genes or proteins in other cereals, including maize, wheat, secale, sorghum, barley, millet, teff, milo, triticale, flax, grama grass, Tripsacum sp., and teosinte. The nucleotide sequences according to the invention can also be used to physically characterize homologous chromosomes in other cereals, as described by Sarma et al. (Genome 43:191 (2000)), and their use can be extended to non-cereal monocots such as sugarcane, grasses, and lilies.

Given the synteny between rice and other cereal genomes, the nucleotide sequences of the present invention can be used to obtain molecular markers for mapping and, potentially, for positional cloning. Kilian et al. described the use of probes from the rice genomic region of interest to isolate a saturating number of polymorphic markers in barley, which were shown to map to syntenic regions in rice and barley, suggesting that the nucleotide sequences of the invention derived from the rice genome would be useful in positional cloning of syntenic grain-filling genes of interest from other cereal species (Kilian et al., Nucl Acids Res 23:2729 (1995); Kilian et al., Plant Mol Biol 35:187 (1997)). Synteny between rice and barley has recently been reported in the area carrying malting quality QTLs (Han et al., Genome 41:373 (1998)), and use of synteny between cereals for positional cloning efforts is likely to add considerable value to rice genome analysis. Likewise, mapping of the ligules region of sorghum was facilitated using molecular markers from a syntenic region of the rice genome (Zwick et al., Genetics 148:1983 (1998)).

Rice marker technology utilizing the nucleotide sequences of the present invention can also be used to identify QTL alleles from a wild relative of cultivated rice, for example as described by Xiao et al. (Genetics 150:899 (1998)). Wild relatives of domesticated plants represent untapped pools of genetic resources for abiotic and biotic stress resistance, apomixis and other breeding strategies, plant architecture, determinants of yield, secondary metabolites, and other valuable traits. In rice, Xiao et al. (supra) used molecular markers to introduce an average of approximately 5% of the genome of a wild relative, and the resulting plants were scored for phenotypes such as plant height, panicle length and 1000-grain weight. Trait-improving alleles were found for all phenotypes except plant height, where any change is considered negative. Of the 35 trait-improving alleles, Xiao et al. found that 19 had no effect on other phenotypes whereas 16 had deleterious effects on other traits. The nucleotide sequences of the invention, such as those provided in the Sequence Listing, can be employed as molecular markers to identify QTL alleles involved in the regulation of the grain filling process from a wild relative, by which these valuable traits can be introgressed from wild relatives using methods including, but not limited to, that described by Xiao et al. ((1998) supra). Accordingly, the nucleotide sequences of the invention can be employed in a variety of molecular marker technologies for yield improvement.

Following the procedures described above to identify polymorphisms, and using a plurality of the nucleotide sequences of the invention, any individual (or line) can be genotyped. Genotyping a large number of DNA polymorphisms, such as single nucleotide polymorphisms (SNPs), in breeding lines makes it possible to find associations between certain polymorphisms or groups of polymorphisms and certain phenotypes. In addition to sequence polymorphisms, length polymorphisms such as triplet repeats are studied to find associations between polymorphism and phenotype. Genotypes can be used for the identification of particular cultivars, varieties, lines, ecotypes, and genetically modified plants or can serve as tools for subsequent genetic studies of complex traits involving multiple phenotypes.

The patent publication WO95/35505 and U.S. Pat. Nos. 5,445,943 and 5,410,270 describe scanning multiple alleles of a plurality of loci using hybridization to arrays of oligonucleotides. The nucleotide sequences of the invention are suitable for use in genotyping techniques useful for each of the types of mapping discussed above.

In a preferred embodiment, the nucleotide sequences of the invention are useful for identifying and isolating at least one unique stretch of protein-encoding nucleotide sequence. The nucleotide sequences of the invention are compared with other coding sequences having sequence similarity with the sequences provided in the Sequence Listing, using a program such as BLAST. Comparison of the nucleotide sequences of the invention with other similar coding sequences permits the identification of one or more unique stretches of coding sequences encoding polypeptides that are up-regulated during grain filling that are not identical to the corresponding coding sequence being screened. Preferably, a unique stretch of coding sequence of about 25 base pairs (bp) long is identified, more preferably 25 bp, or even more preferably 22 bp, or 20 bp, or yet even more preferably 18 bp or 16 bp or 14 bp. In one embodiment, a plurality of nucleotide sequences is screened to identify unique coding sequences according to the invention. In one embodiment, one or more unique coding sequences according to the invention can be applied to a chip as part of an array, or used in a non-chip array system. In a further embodiment, a plurality of unique coding sequences according to the invention is used in a screening array. In another embodiment, one or more unique coding sequences according to the invention can be used as immobilized probes or as probes in solution. In yet another embodiment, one or more unique coding sequences according to the invention can be used as primers for PCR. In a further embodiment, one or more unique coding sequences according to the invention can be used as organism-specific primers for PCR in a solution containing DNA from a plurality of sources.
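The idea of a unique stretch can be sketched computationally. The Python function below is a simplified, exact-match stand-in for the BLAST comparison described above: it lists every window of a chosen length in a target sequence that does not occur in any of the comparison sequences. The function name `unique_stretches` and the example sequences are hypothetical, and a similarity search such as BLAST would also flag near-matches and reverse-strand matches that this sketch ignores.

```python
def unique_stretches(target, others, k=25):
    """List (position, stretch) pairs of length k found only in `target`.

    Positions are 1-based. Only exact, same-strand matches are considered,
    which is an assumption made to keep the example short.
    """
    target = target.upper()
    seen = set()
    for seq in others:
        seq = seq.upper()
        # Index every k-base window of the comparison sequences.
        for i in range(len(seq) - k + 1):
            seen.add(seq[i:i + k])
    return [
        (i + 1, target[i:i + k])
        for i in range(len(target) - k + 1)
        if target[i:i + k] not in seen
    ]

if __name__ == "__main__":
    gene = "ACGTACGTTT"        # hypothetical target sequence
    related = ["ACGTACGTAA"]   # hypothetical similar coding sequence
    for pos, stretch in unique_stretches(gene, related, k=8):
        print(pos, stretch)
```

Shorter values of `k`, such as the 14 to 22 bp lengths mentioned above, can be passed in the same way.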

In another embodiment, unique stretches of nucleotide sequences according to the invention are identified that are preferably about 30 bp, more preferably 50 bp or 75 bp, yet more preferably 100 bp, 150 bp, 200 bp, 250 bp, 500 bp, 750 bp, or 1000 bp. The length of a unique coding sequence may be chosen by one of skill in the art depending on its intended use and on the characteristics of the nucleotide sequence being used. In one embodiment, unique coding sequences according to the invention may be used as probes to screen libraries to find homologs, orthologs, or paralogs. In another embodiment, unique coding sequences according to the invention may be used as probes to screen genomic DNA or cDNA to find homologs, orthologs, or paralogs. In yet another embodiment, unique coding sequences according to the invention may be used to study gene evolution and genome evolution.

EXAMPLES

The invention will be further described by reference to the following detailed examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described in detail in Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989)) and by Ausubel et al. (Current Protocols in Molecular Biology, Greene Publishing (1992)).

Example 1 Isolation and Sequencing of DNA Fragments

1.1 Isolation and Sequencing of Genomic DNA Fragments

Genomic DNA was isolated from nuclei of Oryza sativa L. ssp. japonica cv. Nipponbare and then sheared to produce fragments of approximately 500 bp. Using a method derived from the method of Mao et al. (Genome Res 10:982 (2000)), seeds were germinated on cheese cloth immersed in water and grown for 4-6 weeks under greenhouse conditions. After plants reached a height of approximately 5-8 inches, the upper parts of the green leaves were harvested, wrapped in aluminum foil, and kept at 4° C. overnight. Leaf material was then stored at −80° C. or directly used for extraction of nuclei. Intact nuclei were isolated by homogenization (in a blender for fresh material or by grinding with mortar and pestle for frozen material) in a buffer containing 10 mM Trizma base, 80 mM KCl, 10 mM EDTA, 1 mM spermidine, 1 mM spermine, 0.5 M sucrose, 0.5% Triton-X-100, and 0.15% β-mercaptoethanol, pH 9.5. The homogenate was filtered and nuclei recovered by gentle centrifugation using a fixed-angle rotor at 1,800 g at 4° C. for 20 minutes. The pellet recovered after centrifugation was gently resuspended with the assistance of a small paint brush soaked in ice cold wash buffer, and wash buffer was added. Particulate matter remaining in the suspension was removed by filtering the resuspended nuclei into a 50 ml centrifuge tube through two layers of miracloth by gravity and centrifuging the filtrate at 57 g (500 rpm), 4° C. for 2 minutes to remove intact cells and tissue residues. The supernatant fluid was transferred into a fresh centrifuge tube and nuclei were pelleted by centrifugation at 1,800 g, 4° C. for 15 minutes in a swinging bucket centrifuge.
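The pairing of 57 g with 500 rpm in the procedure above follows from the standard relation between relative centrifugal force (RCF), rotor radius, and speed, RCF = 1.118 × 10⁻⁵ × r(cm) × rpm². The following Python sketch applies that relation; the function names are illustrative, and the rotor radius is not stated in the text, so the radius computed here is an inference from the 57 g / 500 rpm pair rather than a disclosed parameter.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm

def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2

def rpm_from_rcf(rcf, radius_cm):
    """Speed (rpm) needed to reach a given RCF at a given rotor radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))

def radius_from(rcf, rpm):
    """Rotor radius (cm) implied by an RCF/rpm pair."""
    return rcf / (RCF_CONSTANT * rpm ** 2)

if __name__ == "__main__":
    # Radius implied by "57 g (500 rpm)"; an inference, not a stated value.
    radius_cm = radius_from(57, 500)
    print("implied radius (cm):", round(radius_cm, 1))
    # Speed that would give 1,800 g on a rotor of that same radius.
    print("rpm for 1,800 g:", round(rpm_from_rcf(1800, radius_cm)))
```

Because the two 1,800 g spins use different rotor types, the speed printed for 1,800 g applies only if the same radius is assumed.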

DNA was isolated from the nuclear preparation by phenol-chloroform extraction, as in Sambrook et al. (supra). Isolated total genomic DNA was physically sheared (Hydroshear) to generate random DNA fragments, and fragments of approximately 500 bp were recovered. DNA was eluted and the ends filled in using T4 DNA polymerase, Klenow fragment, and dNTPs. Double-stranded DNA was linkered and cloned into a Novartis proprietary medium-copy vector derived from pSC101.

Vector inserts were amplified by PCR and sequenced using the MegaBACE sequencing system (Molecular Dynamics, Amersham). The amplification reaction was diluted before use and was not purified using an exonuclease/alkaline phosphatase procedure. Sequencing reactions were performed using the DYEnamic ET Terminator Kit. The reactions contained approximately 50 ng of amplicon, DYEnamic ET Terminator premix, and 5 pmol of −40 M13 forward primer. The sequencing reaction was amplified for 30 cycles, and reaction products were concentrated and purified using ethanol precipitation. The sample was electrokinetically injected into the capillary at 3 kV for 45 sec and separated via electrophoresis at 9 kV for 120 min.

1.2 Isolation and Sequencing of cDNA Fragments

Construction of rice cDNA library. Total RNA was purified from rice plant tissue using standard total RNA purification methods. PolyA+ RNA was isolated from the total RNA using the Qiagen Oligotex mRNA purification system (Qiagen, Valencia, Calif.), and cDNA was generated using cDNA synthesis reagents from Life Technologies (Rockville, Md.). First strand cDNA synthesis was catalyzed by reverse transcriptase using oligo dT primers with a NotI restriction site. Second strand synthesis was catalyzed by DNA polymerase. An oligonucleotide linker with a SalI restriction endonuclease site was attached to the 5′ end of the cDNAs using DNA ligase. The cDNAs were digested with NotI and SalI restriction endonucleases and inserted into an E. coli-replicating plasmid harboring a selectable marker. E. coli was transformed with the recombinant plasmids and grown on selectable media. E. coli colonies were individually picked off the selectable media and placed into storage plates.

Sequencing the rice cDNA library. The DNA sequence of the cDNA cloned into the plasmid purified from an E. coli colony was determined using standard dideoxy sequencing methods. Oligonucleotide primers respectively corresponding to plasmid DNA regions upstream of the 5′ end of the cDNA insert (Forward reaction) and downstream of the 3′ end of the cDNA insert (Reverse reaction) were used in the dideoxy sequencing reactions. If the DNA sequences determined as a result of the Forward and Reverse reactions from the cDNA overlapped, the two sequences could be merged into a contig using computerized analysis software (Consed, University of Washington, Seattle) to assemble a full-length sequence of the cDNA. In cases where the DNA sequences from the Forward and Reverse reactions from a single clone did not overlap sufficiently to be assembled into a contig, such that a region of unsequenced DNA remained between the Forward and Reverse reads, the DNA sequence of the separating region was determined using one of two dideoxy sequencing methods. In a “primer walking” approach, a primer specifically corresponding to the 3′ end of the DNA sequence determined from the Forward reaction was used in a second dideoxy sequencing reaction. The primer walking procedure was repeated until the DNA sequence that separated the original Forward and Reverse reads was resolved and a contig could be assembled. Alternatively, the clone harboring the cDNA was subjected to transposon in vitro insertion dideoxy sequencing (Epicentre, Madison, Wis.). In this procedure, the insertion process was random and the result was multiple DNA sequence coverage over the targeted cDNA, where the sequences thus obtained were assembled into a contig.
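The merging of overlapping Forward and Reverse reads can be illustrated with a minimal Python sketch. This is not the Consed algorithm, which works with base qualities and tolerates mismatches; it is a simplified exact-overlap merge, and the function names, the minimum overlap, and the example insert are assumptions made for illustration. When no sufficient overlap exists the function returns `None`, which corresponds to the case above where primer walking or transposon-based sequencing was required.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def merge_reads(forward, reverse, min_overlap=20):
    """Merge a Forward read with a Reverse read from the same clone.

    The Reverse read is first reverse-complemented so that both reads are
    on the same strand. The longest exact suffix/prefix overlap of at least
    `min_overlap` bases is used; None is returned if there is none.
    """
    fwd = forward.upper()
    rev = reverse_complement(reverse)
    for size in range(min(len(fwd), len(rev)), min_overlap - 1, -1):
        if fwd[-size:] == rev[:size]:
            return fwd + rev[size:]
    return None  # gap remains between the two reads

if __name__ == "__main__":
    insert = "ATGGCGTACGTTAGCCTAGGATCCGATTACAGGCTTAAGC"  # hypothetical cDNA
    forward_read = insert[:25]
    reverse_read = reverse_complement(insert[15:])
    print(merge_reads(forward_read, reverse_read, min_overlap=10) == insert)
```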

Example 2 GeneChip® Standard Protocol

The standard protocol for using the GeneChip® to quantitatively measure plant gene expression was carried out as outlined below:

Quantitation of total RNA

Total RNA from plant tissue was extracted and quantified.

1: Quantified total RNA using GeneQuant

-   -   1 OD₂₆₀ = 40 µg RNA/ml; A₂₆₀/A₂₈₀ = 1.9 to about 2.1

2: Ran gel to check the integrity and purity of the extracted RNA

Synthesis of Double-Stranded cDNA

Gibco/BRL SuperScript Choice System for cDNA Synthesis (Cat# 18090-019) was employed to prepare cDNAs. T7-(dT)₂₄ oligonucleotides (5′-GGCCAGTGAATTGTAATACGACTCACTATAGGGAGGCGG-(dT)₂₄-3′; SEQ ID NO: 4709) were prepared and purified by HPLC.

Step 1. Primer Hybridization:

-   -   Incubated at 70° C. for 10 minutes
-   -   Spun quickly and put on ice briefly

Step 2. Temperature Adjustment:

-   -   Incubated at 42° C. for 2 minutes

Step 3. First Strand Synthesis Carried Out Using:

-   -   DEPC-water: 1 µl
-   -   RNA (10 µg final): 10 µl
-   -   T7-(dT)₂₄ Primer (100 pmol final): 1 µl
-   -   5× 1^(st) strand cDNA buffer: 4 µl
-   -   0.1 M DTT (10 mM final): 2 µl
-   -   10 mM dNTP mix (500 µM final): 1 µl
-   -   Superscript II RT 200 U/µl: 1 µl
-   -   Total of 20 µl
-   -   Mixed well
-   -   Incubated at 42° C. for 1 hour
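The final concentrations quoted for the first strand reaction can be checked with the usual dilution relation C_final = C_stock × V_added / V_total. The Python sketch below assumes that the component volumes in the list above are in microliters; the dictionary keys and the function name are illustrative labels, not part of the protocol.

```python
# Component volumes of the first strand reaction, read as microliters
# (an assumption about the units of the list above).
FIRST_STRAND_UL = {
    "DEPC-water": 1,
    "RNA": 10,
    "T7-(dT)24 primer": 1,
    "5x first strand cDNA buffer": 4,
    "0.1 M DTT": 2,
    "10 mM dNTP mix": 1,
    "SuperScript II RT": 1,
}

def final_concentration(stock, volume_ul, total_ul):
    """Dilution relation: result is in the same units as `stock`."""
    return stock * volume_ul / total_ul

if __name__ == "__main__":
    total = sum(FIRST_STRAND_UL.values())
    print("total volume (ul):", total)
    # 0.1 M DTT expressed as 100 mM
    print("DTT (mM):", final_concentration(100, FIRST_STRAND_UL["0.1 M DTT"], total))
    # 10 mM dNTP mix expressed as 10,000 uM
    print("dNTP (uM):", final_concentration(10_000, FIRST_STRAND_UL["10 mM dNTP mix"], total))
    # 5x buffer should end up at 1x
    print("buffer (x):", final_concentration(5, FIRST_STRAND_UL["5x first strand cDNA buffer"], total))
```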

Step 4. Second Strand Synthesis:

-   -   Placed reactions on ice, quick spin
        -   DEPC-water: 91 µl
        -   5× 2^(nd) strand cDNA buffer: 30 µl
        -   10 mM dNTP mix (250 µM final): 3 µl
        -   E. coli DNA ligase (10 U/µl): 1 µl
        -   E. coli DNA polymerase I (10 U/µl): 4 µl
        -   RNase H (2 U/µl): 1 µl
        -   T4 DNA polymerase (5 U/µl): 2 µl
        -   0.5 M EDTA (0.5 M final): 10 µl
        -   Total: 162 µl
-   -   Mixed/spun down/incubated at 16° C. for 2 hours

Step 5. Completing the Reaction:

-   -   Incubated at 16° C. for 5 minutes

Purification of Double Stranded cDNA

-   -   1. Centrifuged PLG (Phase Lock Gel, Eppendorf 5 Prime Inc., pl-188233) at 14,000 × g, transferred 162 µl of cDNA to PLG
-   -   2. Added 162 µl of Phenol:Chloroform:Isoamyl alcohol (pH 8.0), centrifuged 2 minutes
-   -   3. Transferred the supernatant to a fresh 1.5 ml tube and added:
        -   Glycogen (5 mg/ml): 2 µl
        -   0.5 M NH₄OAc (0.75 × Vol): 120 µl
        -   EtOH (2.5 × Vol, −20° C.): 400 µl
-   -   4. Mixed well and centrifuged at 14,000 × g for 20 minutes
-   -   5. Removed supernatant, added 0.5 ml 80% EtOH (−20° C.)
-   -   6. Centrifuged for 5 minutes, air dried or dried by speed vac for 5-10 minutes
-   -   7. Added 44 µl DEPC H₂O

Analyzed quantity and size distribution of cDNA

Ran a gel using a 1:1 ratio of the double-stranded synthesis product to loading buffer

Synthesis of biotinylated cRNA (used Enzo BioArray High Yield RNA Transcript Labeling Kit Cat#900182)

-   -   Purified cDNA: 22 µl
-   -   10× Hy buffer: 4 µl
-   -   10× biotin ribonucleotides: 4 µl
-   -   10× DTT: 4 µl
-   -   10× RNase inhibitor mix: 4 µl
-   -   20× T7 RNA polymerase: 2 µl
-   -   Total: 40 µl

-   -   Centrifuged 5 seconds, and incubated for 4 hours at 37° C.

-   -   Gently mixed every 30-45 minutes

Purification and quantification of cRNA (used Qiagen RNeasy Mini kit Cat# 74103)

-   -   cRNA: 40 µl
-   -   DEPC H₂O: 60 µl
-   -   RLT buffer: 350 µl (mix by vortexing)
-   -   EtOH: 250 µl (mix by pipetting)
-   -   Total: 700 µl
-   -   Waited 1 minute or more for the RNA to stick
-   -   Centrifuged at 2000 rpm for 5 minutes
-   -   RPE buffer: 500 µl
-   -   Centrifuged at 10,000 rpm for 1 minute
-   -   RPE buffer: 500 µl
-   -   Centrifuged at 10,000 rpm for 1 minute
-   -   Centrifuged at 10,000 rpm for 1 minute to dry the column
-   -   DEPC H₂O: 30 µl
-   -   Waited for 1 minute, then eluted cRNA by centrifugation, 10 K, 1 minute
-   -   DEPC H₂O: 30 µl
-   -   Repeated previous step
-   -   Determined concentration and diluted to 1 µg/µl concentration

Fragmentation of cRNA

-   -   cRNA (1 µg/µl): 15 µl
-   -   5× Fragmentation Buffer*: 6 µl
-   -   DEPC H₂O: 9 µl
-   -   Total: 30 µl

*5× Fragmentation Buffer: 1 M Tris (pH 8.1) 4.0 ml, MgOAc 0.64 g, KOAc 0.98 g, DEPC H₂O to a total of 20 ml, Filter Sterilize

Array washed and stained in:

-   -   Stringent Wash Buffer**
-   -   Non-Stringent Wash Buffer***
-   -   SAPE Stain****
-   -   Antibody Stain*****

Washed on Fluidics Station Using the Appropriate Antibody Amplification Protocol

-   -   **Stringent Buffer: 12× MES 83.3 ml, 5 M NaCl 5.2 ml, 10% Tween 1.0 ml, H₂O 910 ml, Filter Sterilize
-   -   ***Non-Stringent Buffer: 20× SSPE 300 ml, 10% Tween 1.0 ml, H₂O 698 ml, Filter Sterilize, Antifoam 1.0
-   -   ****SAPE Stain: 2× Stain Buffer 600 µl, BSA 48 µl, SAPE 12 µl, H₂O 540 µl
-   -   *****Antibody Stain: 2× Stain Buffer 300 µl, H₂O 266.4 µl, BSA 24 µl, Goat IgG 6 µl, Biotinylated Ab 3.6 µl
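As an arithmetic aid, the stain recipes above can be totalled and scaled to a different batch size. The Python sketch below assumes the stain volumes are in microliters; the dictionary names, the function name `scale_recipe`, and the scaling factors are illustrative and are not part of the protocol.

```python
# Stain recipes as listed above, volumes read as microliters (assumed units).
SAPE_STAIN_UL = {"2x Stain Buffer": 600, "BSA": 48, "SAPE": 12, "H2O": 540}
ANTIBODY_STAIN_UL = {
    "2x Stain Buffer": 300,
    "H2O": 266.4,
    "BSA": 24,
    "Goat IgG": 6,
    "Biotinylated Ab": 3.6,
}

def scale_recipe(recipe, factor):
    """Multiply every component volume by `factor`, rounded to 0.01 ul."""
    return {name: round(volume * factor, 2) for name, volume in recipe.items()}

if __name__ == "__main__":
    print("SAPE stain total (ul):", sum(SAPE_STAIN_UL.values()))
    print("Antibody stain total (ul):", round(sum(ANTIBODY_STAIN_UL.values()), 1))
    # Hypothetical example: prepare half a batch of SAPE stain.
    for name, volume in scale_recipe(SAPE_STAIN_UL, 0.5).items():
        print(name, volume)
```

In both recipes the 2× Stain Buffer makes up half of the total volume, which is consistent with a 1× working concentration.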

Example 3 Profiling of Genes Involved in Nutrition Partitioning During Grain Development

A GeneChip® Rice Genome Array (Affymetrix, Santa Clara, Calif.) was used to examine how accumulation of carbohydrates, storage protein and fatty acids is coordinated at the RNA level during grain development.

RNA expression of three major pathways and associated genes involving nutrition partitioning was examined, including synthesis and transport of carbohydrates, proteins, and fatty acids. A total of 491 genes involved in these pathways were first selected based on their sequence annotation and functional classification. RNA expression was determined in 39 samples representing different developmental stages, including samples collected before and during grain filling.

3.1 Plant Growth Conditions and Sampling

Nipponbare rice was grown in the greenhouse with a 12 hr light cycle and a temperature of 29° C. during the day and 21° C. during the night. Humidity was maintained at 30%. Plants were grown in pots containing 50% sunshine mix and 50% nitrohumus. The descriptions of the samples collected for this analysis are listed in Table 1. Individual tissues were collected from a minimum of five plants and pooled. Total RNA was extracted from one gram of tissue using the Qiagen RNeasy Maxi kit (Qiagen, Valencia, Calif.).

The experiments were carried out as described in T. Zhu et al., Plant Physiol. Biochem. 39, 221 (2001).

TABLE 1 Rice samples included in the study of genes involved in nutrition partitioning during grain development

Description | Days after germination | Developmental stage | Rank | Category
germinating seedling (root) | 5 | 11 | 1 | root
germinating seedling [LEAF] | 5 | 12 | 1 | leaf
3-4 leaf arial | 18 | 13 | 2 | arial
tillering stage (root) | 49 | 14 | 3 | root
tillering stage (leaf) | 49 | 15 | 3 | leaf
tillering stage (arial) | 49 | 16 | 3 | arial
Booting Stage panicle 1-3 cm | 60 | 17 | 4 | repr
Booting stage panicle 4-7 cm | 62 | 18 | 5 | repr
Booting Stage panicle 8-14 cm | 64 | 19 | 6 | repr
Booting Stage panicle 15-20 cm | 66 | 20 | 7 | repr
Booting Stage root | 60 | 22 | 6 | root
Booting Stage leaf | 60 | 23 | 6 | leaf
Booting stage arial | 60 | 24 | 6 | arial
panicle emergence-root | 78 | 25 | 8 | root
panicle emergence-stem | 78 | 26 | 8 | stem
panicle emergence-panicle | 78 | 21 | 8 | repr
Seed milk stage [~9DAF] | 88 | 39 |  | repr
Seed-soft dough [~14DAF] | 94 | 40 | 14 | repr
Seed hard dough [~21DAF] | 100 | 41 | 15 | repr
inflorescence-no seeds | 88 | 30 | 9 | repr
maturation stem | 90 | 27 | 15 | stem
maturation root | 90 | 28 | 15 | root
maturation leaf | 90 | 29 | 15 | leaf
embryo | 88 | 42 | 14 | embryo
endosperm | 88 | 43 | 14 | endospm
seed coat | 88 | 44 | 14 | coat
Senescence-stem | 100 | 31 | 16 | stem
Senescence [LEAF] | 100 | 32 | 16 | leaf
aleurone | 88 | 45 | 14 | aleurone
pollen mixed | 55 | 33 |  | pollen
seed day 0 post anthesis | 79 | 34 | 9 | repr
seed day 2 post anthesis | 81 | 35 | 10 | repr
seed day 4 post anthesis | 83 | 36 | 11 | repr
seed day 7 post anthesis | 86 | 37 | 12 | repr
seed day 8 post anthesis | 87 | 38 | 13 | repr

Example 4 Characterization of Gene Expression Profiles

4.1 Data Analysis I

A rice gene array and probes derived from rice RNA extracted from different tissues and developmental stages were used to identify the expression profile of genes on the chip. The rice array contains over 23,000 genes (approximately 18,000 unique genes), or roughly 50% of the rice genome, and is similar to the Arabidopsis GeneChip® (Affymetrix) with the exception that the 16 oligonucleotide probe sets do not contain mismatch probe sets. The level of expression is therefore determined by internal software that analyzes the intensity level of the 16 probe sets for each gene. The highest and lowest probes are removed if they do not fit into a set of predefined statistical criteria, and the remaining sets are averaged to give an expression value. The final expression values are normalized by software, as described below. The advantages of a gene chip in such an analysis include a global gene expression analysis, quantitative results, a highly reproducible system, and a higher sensitivity than Northern blot analyses.
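The averaging step described above can be sketched as follows. The predefined statistical criteria applied by the internal software are not disclosed in this text, so the Python example below substitutes a simple, assumed rule: the lowest and highest probe intensities are dropped only if they lie more than a chosen number of standard deviations from the mean of the remaining probes. The function name, the threshold `max_sd`, and the example intensities are hypothetical.

```python
from statistics import mean, stdev

def expression_value(intensities, max_sd=2.0):
    """Average probe intensities, conditionally dropping the two extremes.

    The rejection rule (distance from the mean of the inner probes, in
    standard deviations) is an assumed stand-in for the undisclosed
    criteria used by the array software.
    """
    probes = sorted(intensities)
    if len(probes) < 4:
        return mean(probes)  # too few probes to judge the extremes
    inner = probes[1:-1]
    center, spread = mean(inner), stdev(inner)
    kept = list(inner)
    # Keep the lowest and highest probes only if they pass the criterion.
    for extreme in (probes[0], probes[-1]):
        if abs(extreme - center) <= max_sd * spread:
            kept.append(extreme)
    return mean(kept)

if __name__ == "__main__":
    # Sixteen hypothetical probe intensities with one aberrant bright probe.
    probe_set = [90, 95, 100, 105, 110] * 3 + [5000]
    print(expression_value(probe_set))
```

Normalization across arrays, which the text assigns to separate software, is outside the scope of this sketch.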

4.2 Data Analysis II

Data analysis was done using GeneSpring (Silicon Genetics, Redwood, Calif.) and AlignAce. The genechip sequence was blasted to the AC rice contig sequences. The contig with the best alignment was extracted and five gene prediction programs were run on each contig. The programs used were Genscan trained on Arabidopsis and maize, Gmhmm trained on rice and Arabidopsis, and Fgenesh and Glimmer trained on rice. All of the predicted CDSs were blasted against the genechip sequence again to extract the top hit predicted CDS. A Perl script was utilized to extract up to 2 kb of the putative promoter sequence. In some of the genechip sequences there was more than one perfect alignment to a predicted CDS; in such cases, both of the perfect alignments were accepted as the putative genes.

TABLE 2

Table 2 provides a subset of rice genes the expression of which is up-regulated during grain filling. Further identified are SSR sequences in the coding region of the rice genes.

A = Genes involved in rice grain filling, which belong to the functional category of Carbohydrate Metabolism
B = Genes involved in rice grain filling, which belong to the functional category of transmembrane proteins
C = Genes involved in rice grain filling, which belong to the functional category of storage proteins
D = Genes involved in rice grain filling, which belong to the functional category of stress response proteins
E = 345 Grain Filling Genes
F = Genes involved in rice grain filling, which belong to the functional category of signaling molecules
G = Genes involved in rice grain filling, which belong to the functional category of transcription factors
H = Genes involved in rice grain filling, which belong to the functional category of Amino Acid Metabolism
I = Genes involved in rice grain filling, which belong to the functional category of Fatty Acid Metabolism
J = Cereal_Grain_Filling_QTLs (a description of the respective QTLs is provided in Table . . .
below) K = Beginning of the SSR L = End of theSSR M = Nucleotide Sequence of the tri- and tetra-nucleotide repeatunits SEQ ID A B C D E F G H I J K L M 101 X — — — X — — — — 113 X — — —X — — — — 42 59 CCT 1 — — — X X — — — — 317 X — — — X — X — — 329 — — —— X — — — — OS-FLLEN-9-1, OS-GPL-4-1, OS-GPP-4-1, OS-GW100-4-1,OS-GYLD-4-1 173 X — — — X — — — — 331 — — — — X — — — — OS-GW-5-1 5 19CGG OS-YLD-5-1, ZM-MOIST-4-3, ZM-DMY-4-3, ZM-YLD-4-1 333 — — — — X — — —— 233 — — X — X — — — — 335 — — — — X — — — — 119 X — — — X — — — — 311X — — — X — X — — 358 372 CGC 661 675 CGG 149 X — — — X — — — — 337 — —— — X — — — — 59 — X — — X — — — — 339 — — — — X — — — — 155 X — — — X —— — — 1207 1221 CTG 143 X — — — X — — — — 307 — — — — X — X — — 155 175CTG 341 — — — — X — — — — 193 X — — — X — — — — SMS015-9, 1401 1415 CGTZM-MOIST-4-2, ZM-DMY-4-1 131 X — — — X — — — — 199 X — — — X — — — —OS-AE-1-1, 207 221 CGC OS-AE-5-1, OS-APDF-9-1, OS-REGEN-3-1, OS-RGT-5-1,OS-VGT-2-2, OS-VGT-5-1, OS-GC-2-1, OS-GYLD-1-1, SMS021-80, ZM-CPC-5-1,ZM-ID-5-1, ZM-IVDOM-5-1, ZM-IVDOM-5-2, ZM-IVDOM-5-3, ZM-MOIST-5-2,ZM-MOIST-5-2, ZM-MOIST-5-3, ZM-BIOM-5-1, ZM-DMC-6-2, ZM-DMY-5-1,ZM-GYLD-5-1, ZM-GYLD-5-3, ZM-GYLD-5-3, ZM-GYLD-6-4, ZM-GYLD-6-4,ZM-KW300-5-1, ZM-TW-5-1, ZM-YLD-6-1 301 — — — — X — X — — OS-VGT-2-2,OS-GC-2-1 343 — — — — X — — — — OS-FLLEN-3-1, OS-GPL-2-1, OS-GYLD-2-1,ZM-ID-5-2, ZM-MOIST-4-3, ZM-MOIST-5-4, ZM-PC-5-1, ZM-STC-5-1,ZM-DMC-5-1, ZM-DMY-4-3, ZM-GYLD-5-2 287 — — — — X — — X — 191 X — — — X— — — — 215 — — X — X — — — — 373 387 TCG 972 986 CCG 23 — — — — X X — —— ZM-MOIST-2-3, ZM-STC-2-2, ZM-DMY-2-3, ZM-DMY-2-4, ZM-GYLD-2-3 147 X —— — X — — — — 345 — — — — X — — — — 347 X — — — X — — — — OS-GPDF-1-1,SMS015-16, ZM-CL-9-1, ZM-CPC-3-1, ZM-CPC-3-3, ZM-CPC-8-1, ZM-ID-8-1,ZM-ID-8-1, ZM-ID-8-1, ZM-IVDOM-3-1, ZM-IVDOM-3-3, ZM-MOIST-8-1,ZM-MOIST-8-2, ZM-MOIST-9-2, ZM-PC-8-1, ZM-PC-9-1, ZM-PR-9-1, ZM-STC-8-1,ZM-BIOM-8-1, ZM-DMC-8-1, ZM-DMC-8-2, ZM-DMY-3-2, ZM-DMY-3-3, ZM-DMY-8-1,ZM-DMY-8-2, 
ZM-GWE-9-1, ZM-GWM2-3-1, ZM-GYHA-8-1, ZM-GYLD-8-2,ZM-GYLD-9-1, ZM-HI-3-1, ZM-HI-8-1, ZM-KW100-9-1, ZM-KW300-3-2,ZM-KW300-8-2, ZM-KW300-9-2, ZM-TGW-9-1, ZM-TW-8-1, ZM-YLD-9-1,ZM-YLD-9-1 157 X — — — X — — — — MAS24-2, 126 140 CCT ZM-CPC-1-4,ZM-CPC-1-6, ZM-MOIST-4-3, ZM-MOIST-7-3, ZM-MOIST-7-4, ZM-MOIST-9-2,ZM-MOIST-9-2, ZM-PC-9-1, ZM-BIOM-3-1, ZM-DMC-1-2, ZM-DMY-1-3,ZM-DMY-1-5, ZM-DMY-4-3, ZM-GWM2-3-2, ZM-GYLD-3-3, ZM-GYLD-9-1,ZM-GYUI-9-1, ZM-GYUI-9-2, ZM-GYUP-9-2, ZM-KW100-9-1, ZM-KW300-9-1,ZM-KW300-9-2, ZM-YLD-9-1 349 — — — — X — — — — 139 X — — — X — — — — 175X — — — X — — — — 5 — — — X X — — — — 351 — — — — X — — — — 353 X — — —X — — — — 309 — — — — X — X — — OS-RGT-2-1, 378 392 CAA OS-VGT-2-1 355 —— — — X — — — — 255 — — — — X — — — X OS-GW-9-1, MAS13-24, MAS13-31,ZM-CPC-1-3, ZM-CPC-1-5, ZM-CPC-7-2, ZM-CPC-7-3, ZM-IVDOM-1-2,ZM-IVDOM-1-4, ZM-MOIST-1-4, ZM-MOIST-1-5, ZM-MOIST-7-1, ZM-MOIST-7-2,ZM-PC-1-1, ZM-STC-7-2, ZM-BIOM-7-1, ZM-DMC-1-1, — ZM-DMY-1-4,ZM-GWM2-7-1, ZM-GYLD-7-3, ZM-GYUP-1-2, ZM-HI-7-1, ZM-KW300-1-2,ZM-TW-1-1 75 X — — — X — — — — 357 — — — — X — — — — 359 — — — — X — — —— OS-GW-5-1, OS-YLD-5-1, ZM-MOIST-4-3, ZM-DMY-4-3, ZM-YLD-4-1 361 — — —— X — — — — 363 — — — — X — — — — OS-GW-3-1, ZM-CPC-1-2, ZM-IVDOM-1-1,ZM-IVDOM-9-1, ZM-IVDOM-9-2, ZM-MOIST-1-2, ZM-MOIST-1-2, ZM-MOIST-9-3,ZM-DMY-9-1, ZM-GYHA-1-3, ZM-GYHA-1-4, ZM-GYLD-1-1, ZM-GYLD-9-2,ZM-GYLD-9-2, ZM-GYUP-1-1, ZM-GYUP-1-1, ZM-HI-1-1, ZM-KW100-1-2,ZM-KW100-9-1, ZM-TGW-9-2, ZM-TW-9-1, ZM-YLD-1-1 365 — — — — X — — — —181 X — — — X — — — — 367 — — — — X — — — — 261 — — — — X — — — X 221 —— X — X — — — — 57 — X — — X — — — — 25 — — — — X X — — — 1047 1061 CGC369 — — — — X — — — — OS-CHALK-10-1, ZM-MOIST-2-3, ZM-DMY-2-3,ZM-GYLD-2-3 39 — X — — X — — — — 87 X — — — X — — — — OS-APDF-9-1, 30 44CCT MAS13-24, 1391 1411 CCG ZM-CPC-1-3, ZM-CPC-1-5, ZM-IVDOM-1-2,ZM-IVDOM-1-4, ZM-MOIST-1-4, ZM-MOIST-1-5, ZM-MOIST-2-3, ZM-PC-1-1,ZM-STC-2-2, ZM-DMC-1-1, ZM-DMY-1-1, ZM-DMY-2-3, ZM-DMY-2-4, 
ZM-GYLD-2-1,ZM-GYLD-2-3, ZM-GYUP-1-2, ZM-KW300-1-2, ZM-TW-1-1, ZM-YLD-2-1,ZM-YLD-2-2 371 — — — — X — — — — 163 X — — — X — — — — 373 — — — — X — —— — 313 — — — — X — X — — OS-GW-5-1, OS-YLD-5-1 375 — — — — X — — — —315 X — — — X — X — — OS-GPL-4-1, 683 703 CCG OS-GPP-4-1, OS-GYLD-4-1,MAS24-2, ZM-CPC-3-2, ZM-ID-10-1, ZM-ID-2-1, ZM-MOIST-10-1, ZM-MOIST-2-2,ZM-MOIST-3-2, ZM-MOIST-9-2, ZM-PC-9-1, ZM-STC-10-1, ZM-BIOM-3-1,ZM-DMC-10-1, ZM-DMC-10-2, ZM-DMC-2-3, ZM-DMY-10-1, ZM-DMY-3-1,ZM-EWT-2-1, ZM-GWM2-10-1, ZM-GWM2-3-2, ZM-GYHA-3-1, ZM-GYLD-2-2,ZM-GYLD-3-3, ZM-GYUI-9-1, ZM-GYUI-9-2, ZM-GYUP-9-2, ZM-HI-10-1,ZM-KW300-3-3, ZM-KW300-9-1, ZM-KW300-9-2, ZM-TW-10-2, ZM-TW-2-3 89 X — —— X — — — — 377 — — — — X — — — — 289 — — — — X — — X — 49 — X — — X — —— — 153 X X — — X — — — — 81 X — — — X — — — — 379 — — — — X — — — — 707721 CGC 882 902 GGA 305 — — — — X — X — — OS-BDV-1-1, OS-CHALK-1-1,OS-CPV-1-1, OS-CSV-1-1, OS-SBV-1-1, OS-GP-1-1, OS-GW-1-2, OS-YLD-1-1,ZM-MOIST-1-1, ZM-MOIST-1-2, ZM-GYHA-1-2, ZM-GYHA-1-3, ZM-GYUP-1-1,ZM-HI-1-1, ZM-KW100-1-2 381 — — — — X — — — — OS-GPL-4-1, OS-GPP-4-1,OS-GYLD-4-1, MAS24-2, ZM-CPC-3-2, ZM-ID-10-1, ZM-ID-2-1, ZM-MOIST-10-1,ZM-MOIST-2-2, ZM-MOIST-3-2, ZM-MOIST-9-2, ZM-PC-9-1, ZM-STC-10-1,ZM-BIOM-3-1, ZM-DMC-10-1, ZM-DMC-10-2, ZM-DMC-2-3, ZM-DMY-10-1,ZM-DMY-3-1, ZM-EWT-2-1, ZM-GWM2-10-1, ZM-GWM2-3-2, ZM-GYHA-3-1,ZM-GYLD-2-2, ZM-GYLD-3-3, ZM-GYUI-9-1, ZM-GYUI-9-2, ZM-GYUP-9-2,ZM-HI-10-1, ZM-KW300-3-3, ZM-KW300-9-1, ZM-KW300-9-2, ZM-TW-10-2,ZM-TW-2-3 197 X — — — X — — — — 45 — X — — X — — — — 97 X — — — X — — —— 383 — — — — X — — — — 135 X — — — X — — — — 267 X — — — X — — — X 217234 CCG 385 — — — — X — — — — 90 107 CGG 575 592 CCG 33 — X — — X — — —— 283 — — — — X — — X — 391 408 CGG 53 — X — — X — — — — 253 — — — — X —— — X 387 — — — — X — — — — 295 — — — — X — — X — OS-GPL-4-1,OS-GPP-4-1, OS-GYLD-4-1, MAS24-2, ZM-CPC-3-2, ZM-ID-10-1, ZM-ID-2-1,ZM-MOIST-10-1, ZM-MOIST-2-2, ZM-MOIST-3-2, ZM-MOIST-9-2, ZM-PC-9-1,ZM-STC-10-1, ZM-BIOM-3-1, 
ZM-DMC-10-1, ZM-DMC-10-2, ZM-DMC-2-3,ZM-DMY-10-1, ZM-DMY-3-1, ZM-EWT-2-1, ZM-GWM2-10-1, ZM-GWM2-3-2,ZM-GYHA-3-1, ZM-GYLD-2-2, ZM-GYLD-3-3, ZM-GYUI-9-1, ZM-GYUI-9-2,ZM-GYUP-9-2, ZM-HI-10-1, ZM-KW300-3-3, ZM-KW300-9-1, ZM-KW300-9-2,ZM-TW-10-2, ZM-TW-2-3 389 — — — — X — — — — 225 — — X — X — — — — 391 —— — — X — — — — 167 X — — — X — — — — OS-GW-3-1, MAS19-14, SMS021-79,ZM-CL-9-1, ZM-CPC-1-2, ZM-CPC-6-2, ZM-ID-8-1, ZM-ID-8-1, ZM-IVDOM-1-1,ZM-IVDOM-1-3, ZM-IVDOM-9-1, ZM-IVDOM-9-2, ZM-MOIST-1-2, ZM-MOIST-1-3,ZM-MOIST-4-3, ZM-MOIST-9-3, ZM-PC-8-1, ZM-PC-9-1, ZM-PR-9-1,ZM-BIOM-8-1, ZM-DMC-6-1, ZM-DMC-8-1, ZM-DMY-1-2, ZM-DMY-4-3, ZM-DMY-8-2,ZM-DMY-9-1, ZM-GWE-9-1, ZM-GYHA-1-1, ZM-GYHA-1-4, ZM-GYHA-8-1,ZM-GYLD-1-1, ZM-GYLD-1-2, ZM-GYLD-6-1, ZM-GYLD-6-4, ZM-GYLD-9-2,ZM-GYLD-9-2, ZM-GYUP-1-1, ZM-HI-1-1, ZM-HI-8-1, ZM-KW100-9-1,ZM-KW300-8-2, ZM-TGW-9-1, ZM-TGW-9-2, ZM-TW-9-1, ZM-YLD-1-1, ZM-YLD-9-1137 X — — — X — — — — 393 — — — — X — — — — 195 X — — — X — — — — 263 —— — — X — — — X 41 — X — — X — — — — 303 — — — — X — X — — 223 — — X — X— — — — 85 X — — — X — — — — 395 — — — — X — — — — 129 X — — — X — — — —OS-ASS-6-2, MAS24-2, ZM-ID-5-2, ZM-MOIST-5-4, ZM-MOIST-9-2, ZM-PC-5-1,ZM-PC-9-1, ZM-STC-5-1, ZM-DMC-5-1, ZM-GYLD-5-2, ZM-GYUI-9-1,ZM-GYUI-9-1, ZM-GYUI-9-2, ZM-GYUP-9-1, ZM-GYUP-9-2, ZM-KW300-9-1,ZM-KW300-9-2 103 X — — — X — — — — 51 — X — — X — — — — 99 — — — — X — —— — 69 X — — — X — — — — 397 — — — — X — — — — 229 — — X — X — — — — 399— — — — X — — — — 241 — — X — X — — — — 91 X — — — X — — — — 401 — — — —X — — — — 121 X — — — X — — — — 403 — — — — X — — — — 187 X — — — X — —— — 405 — — — — X — — — — 13 — — — X X — — — — 243 — — X — X — — — — 203X — — — X — — — — 441 455 CGG 407 — — — — X — — — — 409 — — — — X — — —— 411 — — — — X — — — — 243 260 CAG 105 X — — — X — — — — 107 X — — — X— — — — 235 255 GAG 115 X — — — X — — — — 1449 1463 CGG 15 — — — X X — —— — 165 X — — — X — — — — 123 X — — — X — — — — 205 X — — — X — — — — 63— X — — X — — — — 413 — — — — X — — — — 146 
160 CGG 209 X — — — X — — —— 323 — — — — X — X — — 129 143 CGG 368 385 CCG 77 X — — — X — — — — 415— — — — X — — — — 141 X — — — X — — — — 128 148 CCT 27 — — — — X X — — —65 — X — — X — — — — 185 X — — — X — — — — 299 — — — — X — — X — 5 22CGG 67 — X — — X — — — — 17 — — — X X — — — — 279 — — — — X — — — X 71 X— — — X — — — — 207 X — — — X — — — — 8 25 CCG 417 — — — — X — — — — 127X — — — X — — — — 125 X — — — X — — — — 117 X — — — X — — — — 183 X — —— X — — — — 419 — — — — X — — — — 421 — — — — X — — — — 29 — — — — X X —— — 297 — — — — X — — X — 423 — — — — X — — — — 921 936 AG 425 — — — — X— — — — 245 — — X — X — — — — 427 — — — — X — — — — 429 — — — — X — — —— 247 — — X — X — — — — 249 — — X — X — — — — 159/171 — — X — — — — X 31— X — — X — — — — 275 — — — — X — — — X 217 234 GGC 753 767 CGG 19 — — —— X X — — — 151 X — — — X — — — — 213/227- X — X — — — — OS-FLLEN-9-1,339 353 GTC OS-GW100-4-1, 434 448 AGC MAS24-2, MAS24-3, ZM-CPC-1-4,ZM-CPC-1-6, ZM-CPC-10-1, ZM-IVDOM-10-1, ZM-IVDOM-10-2 ZM-MOIST-1-1,ZM-MOIST-9-2, ZM-PC-9-1, ZM-STC-10-2, ZM-STC-2-2, ZM-DMC-1-2,ZM-DMY-1-3, ZM-DMY-1-5, ZM-DMY-2-4, ZM-GYHA-1-2, ZM-GYUI-9-1,ZM-GYUI-9-1, ZM-GYUI-9-2, ZM-GYUP-9-1, ZM-GYUP-9-2, ZM-HI-1-1,ZM-KW300-9-1, ZM-KW300-9-2 237 — — X — X — — — — 133 X — — — X — X — —239 — — X — X — — — — 161 X — — — X — — — — 61 X — — — X — — — — 47 — X— — X — — — — 219 — — X — X — — — — 259/271- — — X — — — X 93 X — — — X— — — — OS-AE-12-1 111 X — — — X — — — — 275 289 GCG 73 X — — — X — — —— 54 74 CGG 235 — — X — X — — — — 217 — — X — X — — — — 257 — — — — X —— — X 201 X — — — X — — — — OS-AMY-6-1, OS-AMY-6-2, OS-ASS-6-1,OS-GC-6-1, OS-BDV-6-1, OS-CHALK-6-1, OS-CPV-6-1, OS-CPV-6-2, OS-CSV-6-1,OS-CSV-6-2, OS-HPV-6-1, OS-HPV-6-2, OS-SBV-6-1, OS-WC-6-1, OS-DM-6-1,OS-GP-6-1, OS-Y-6-1, MAS24-2, ZM-CPC-6-2, ZM-ID-10-1, ZM-MOIST-10-1,ZM-MOIST-9-2, ZM-MOIST-9-2, ZM-PC-9-1, ZM-STC-10-1, ZM-DMC-10-1,ZM-DMC-10-2, ZM-DMC-6-1, ZM-DMC-6-2, ZM-DMY-10-1, ZM-GWM2-10-1,ZM-GYLD-6-1, ZM-GYLD-6-4, ZM-GYLD-6-4, 
ZM-GYLD-9-1, ZM-GYUI-9-1,ZM-GYUI-9-2, ZM-GYUP-9-2, ZM-HI-10-1, ZM-KW100-9-1, ZM-KW300-9-1,ZM-KW300-9-2, ZM-TW-10-2, ZM-YLD-6-1, ZM-YLD-9-1 281 — — — — X — — X —251 — — — — X — — — X 3 — — — X X — — — — OS-AE-11-1, 24 38 CGCZM-MOIST-1-6, ZM-MOIST-5-1, ZM-PC-1-2, ZM-GWM2-1-1, ZM-GYHA-5-1,ZM-GYLD-5-3, ZM-HI-1-2, ZM-KW100-1-2 21 — — — — X X — — — OS-AE-12-1 179X — — — X — — — — 319 X — — — X — X — — 41 55 CCG 7 — — — X X — — — —291 — — — — X — — X — 10 24 GAG 169 X — — — X — — — — 83 X — — — X — — —— 269 — — — — X — — — X 9 — — — X X — — — — OS-GPL-4-1, OS-GPP-4-1,OS-GYLD-4-1, MAS24-2, MAS24-28, ZM-CPC-3-2, ZM-ID-10-1, ZM-ID-2-1,ZM-MOIST-10-1, ZM-MOIST-2-2, ZM-MOIST-3-2, ZM-MOIST-4-3, ZM-MOIST-5-3,ZM-MOIST-9-2, ZM-PC-9-1, ZM-STC-10-1, ZM-BIOM-3-1, ZM-DMC-10-1,ZM-DMC-10-2, ZM-DMC-2-3, ZM-DMY-10-1, ZM-DMY-3-1, ZM-DMY-4-3,ZM-EWT-2-1, ZM-GWM2-10-1, ZM-GWM2-3-2, ZM-GYHA-3-1, ZM-GYLD-2-2,ZM-GYLD-3-3, ZM-GYLD-5-2, ZM-GYUI-9-1, ZM-GYUI-9-2, ZM-GYUP-9-2,ZM-HI-10-1, ZM-KW300-3-3, ZM-KW300-9-1, ZM-KW300-9-2, ZM-TW-10-2,ZM-TW-2-3 449 — — — — X — — — X 277 — — — — X — — — X 664 681 ACT 285 —— — — X — — X — OS-PGWC-8-1, OS-FLWID-3-1, OS-GPP-8-2, SMS015-9,ZM-CPC-1-3, ZM-CPC-1-5, ZM-IVDOM-1-2, ZM-IVDOM-1-3, ZM-MOIST-1-3,ZM-MOIST-1-4, ZM-MOIST-4-2, ZM-MOIST-4-3, ZM-PC-1-1, ZM-DMC-1-1,ZM-DMY-1-2, ZM-DMY-1-4, ZM-DMY-4-3, ZM-GYHA-1-1, ZM-GYLD-1-2,ZM-GYUP-1-2, M-TW-1-1 325 — — — — X — X — — OS-PGWC-8-1, OS-FLWID-3-1,OS-GPL-8-2, OS-GPP-8-2, OS-GYLD-8-2, ZM-CPC-1-3, ZM-CPC-1-5,ZM-IVDOM-1-2 ZM-IVDOM-1-3 ZM-MOIST-1-3, ZM-MOIST-1-4, ZM-PC-1-1,ZM-DMC-1-1, ZM-DMY-1-2, ZM-DMY-1-4, ZM-GYHA-1-1, ZM-GYLD-1-2,ZM-GYUP-1-2, ZM-TW-1-1 265 — — — — X — — — X OS-FLLEN-3-1, 65 79 CGGOS-GPL-2-1, OS-GYLD-2-1, MAS24-21, ZM-ID-5-2, ZM-MOIST-4-3,ZM-MOIST-4-4, ZM-MOIST-5-4, ZM-PC-5-1, ZM-STC-5-1, ZM-DMC-5-1,ZM-DMY-4-2, ZM-DMY-4-3, ZM-DMY-4-4, ZM-EWT-4-2, ZM-GYLD-4-1,ZM-GYLD-5-2, ZM-HI-4-1, ZM-KNE-4-1, ZM-KW300-4-2, ZM-KWE-4-1, M-TGW-4-1327 — — — X — X 231 — — X — X — — — — ZM-MOIST-2-3, ZM-MOIST-4-3,ZM-STC-2-2, 
ZM-DMY-2-3, ZM-DMY-2-4, ZM-DMY-4-3, M-GYLD-2-3 37 — X — — X— — — — 43 — X — — X — — — — ZM-DMY-4-1 293 — — — — X — — X —OS-CIF-6-1, MAS13-32, ZM-CPC-1-3, ZM-CPC-1-5, ZM-IVDOM-1-2,ZM-MOIST-1-4, ZM-MOIST-2-1, ZM-MOIST-9-2, ZM-PC-1-1, ZM-DMC-1-1,ZM-DMY-1-4, ZM-DMY-2-1, ZM-GYLD-2-4, ZM-GYLD-9-1, ZM-GYUP-1-2,ZM-KW100-9-1, ZM-KW300-9-2, ZM-TW-1-1, ZM-YLD-9-1 321 X — — — X — X — —ZM-CPC-6-2, 536 550 CTG ZM-DMC-6-1, ZM-DMC-6-2, ZM-GYLD-6-1,ZM-GYLD-6-4, ZM-GYLD-6-4, ZM-YLD-6-1 79 X — — — X — — — — OS-AMY-5-1 211— — X — X — — — — OS-APDF-9-1, OS-VGT-9-1, OS-GW-9-1 177 X — — — X — — —— OS-CIF-6-1 44 58 CGT 117 131 GGA

TABLE 3 Table 3 provides a further subset of rice genes the expressionof which is up-regulated during grain filling. Further identified areSSR sequences in the coding region of the rice genes. A = structuralprotein B = hypothetical/unknown proteins C = Growth/division anddevelopment D = classification not clear E = Cereal_Grain_Filling_QTLs(a description of the respective QTLs is provided in Table . . . below)F = Beginning of the SSR G = End of the SSR H = Nucleotide Sequence ofthe trinucleotide repeat unit SEQ ID A B C D E F G H 329 — X — — 331 — —— X 332 X — — — 333 — X — — 334 — X — — 335 — X — — 343 — — — X 23 — X —— 345 — X — — 351 — X — — 355 — X — — 357 — X — — 361 — X — — 363 — X —— 365 — — — X 369 — — — X 371 — — X — 373 — X — — 313 — X — — 375 — X —— 377 — X — — 379 — X — — 381 — — — X 383 — X — — 387 — X — — 389 — X —— 393 — X — — 395 — X — — 99 — — — X 397 — X — — 229 — X — — 403/431- —— 16 39 CCG 433 — X — — OS-AMY-5-1, MAS13-31, SMS021-80, ZM-CPC-5-1,ZM-CPC-7-2, ZM-IVDOM-5-1, ZM-IVDOM-5-2, ZM-MOIST-5-2, ZM-MOIST-5-2,ZM-MOIST-5-3, ZM-MOIST-7-1, ZM-BIOM-5-1, ZM-BIOM-7-1, ZM-DMY-5-1,ZM-GWM2-7-1, ZM-GYLD-5-1, ZM-GYLD-5-3, ZM-HI-7-1, ZM-KW300-5-1,ZM-TW-5-1 435 — — — X 437 — X — — 439 — X — — OS-YLD-3-2, ZM-ID-5-1,ZM-IVDOM-5-3, ZM-GYLD-5-3 441 — — — X OS-REGEN-5-1, 1912 1929 CGGMAS12-18, MAS24-16, SMS015-16, SMS021-81, ZM-ID-6-1, ZM-ID-6-1,ZM-ID-8-1, ZM-ID-8-1, ZM-ID-8-1, ZM-MOIST-5-1, ZM-MOIST-6-2, ZM-PC-8-1,ZM-STC-6-1, ZM-STC-8-1, ZM-VT-6-1, ZM-BIOM-8-1, ZM-DMC-8-1, ZM-DMY-8-1,ZM-DMY-8-2, ZM-GYHA-5-1, ZM-GYHA-6-1, ZM-GYLD-5-3, ZM-GYLD-6-2,ZM-GYLD-6-3, ZM-HI-8-1, ZM-KW300-6-2 443 — — — X OS-RGT-12-2, 117 131CGG OS-GWPL-12-1 1962 1979 CGG 445 — X — — 447 — X — — OS-YLD-3-2 95 — —— X OS-CIF-8-1, OS-GW-8-1, MAS13-24, ZM-CPC-1-3, ZM-CPC-1-5,ZM-IVDOM-1-2, ZM-IVDOM-1-4, ZM-MOIST-1-4, ZM-MOIST-1-5, ZM-PC-1-1,ZM-DMC-1-1, ZM-DMC-6-2, ZM-DMY-1-4, ZM-GYLD-6-4, ZM-GYUP-1-2,ZM-KW300-1-2, ZM-TW-1-1, ZM-YLD-6-1 451 — X — — OS-PGWC-12-1, 962 976GCA OS-BDV-12-1, 
OS-PKV-12-1 453 — X — — 27 47 CCT 344 358 GCG 455 — X —— MAS24-28, ZM-ID-10-1, ZM-ID-2-1, ZM-MOIST-10-1, ZM-MOIST-2-2,ZM-MOIST-4-3, ZM-MOIST-5-3, ZM-STC-10-1, ZM-DMC-10-1, ZM-DMC-10-2,ZM-DMC-2-3, ZM-DMY-10-1, ZM-DMY-4-3, ZM-EWT-2-1, ZM-GWM2-10-1,ZM-GYLD-2-2, ZM-GYLD-5-2, ZM-HI-10-1, ZM-TW-10-2, ZM-TW-2-3 457 — X — —459 — X — — OS-PGWC-12-1, 53 73 CGG OS-BDV-12-1, OS-PKV-12-1 461 — X — —OS-GW-11-1, ZM-IVDOM-9-1, ZM-IVDOM-9-2, ZM-GYLD-9-2, ZM-KW100-9-1,ZM-TGW-9-2

TABLE 4 Genes involved in rice grain filling, which belong to the functional category of stress response proteins

Rice         Banana       Wheat        Maize
(SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  Gene Description
1            —            1065         1182         Similar to MPV1_HUMAN P39210 HOMO SAPIENS (HUMAN). MPV17 PROTEIN.
3                                      1115         Similar to ANRX_ANASP Q44141 ANABAENA SP. (STRAIN PCC 7120). ANAREDOXIN.
5            939          1030         1184         Similar to gi|20286|emb|CAA46916.1| peroxidase [Oryza sativa]
7            935          1037         —            Similar to gi|1620753|gb|AAB17095.1| proteinase inhibitor [Oryza sativa]
9            934          1011         1110         Similar to gi|3287683|gb|AAC25511.1| Similar to apoptosis protein MA-3 gb|D50465 from Mus musculus. [Arabidopsis thaliana]
11           —            952          1198         Similar to gi|5725430|emb|CAB52439.1| stress responsive protein homolog [Arabidopsis thaliana]
13           —            998          1175
15           —            1015         1167
17           899          1042         1161

TABLE 5 Genes involved in rice grain filling, which belong to the functional category of signaling molecules

Rice         Banana       Wheat        Maize
(SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  Gene Description
19           —            1089         —            Similar to gi|1352683|sp|P49599|P2C3_ARATH PROTEIN PHOSPHATASE 2C PPH1 (PP2C)
21           —            971          —            Similar to gi|7269803|emb|CAB79663.1| serine/threonine-specific kinase like protein [Arabidopsis thaliana]
23                                                  Similar to gi|6520139|dbj|BAA87936.1| ZW9 [Arabidopsis thaliana]
25           —            1071         1120         Similar to gi|9293975|dbj|BAB01878.1| receptor protein kinase [Arabidopsis thaliana]
27           916          1049         —
29           —            984          1186

TABLE 6 Genes involved in rice grain filling, which belong to the functional category of transmembrane proteins

Rice         Banana       Wheat        Maize
(SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  Gene Description
31           —            1025         —            (nitrite transporter)
33           —            1047         —            (amino acid selective channel protein)
35           950          959          1164         (G6P transporter in plastids)
37                                                  (PTR2 POT family)
39           949          1017         —            (Leucine rich protein)
41           927          962          1112         (immunoglobulin)
43           917          982          1109         (dehydrogenase)
45           —            954          1117         (putative transport protein)
47           921          1099         1152         (phosphate transporter)
49           891          1040         1128         (monosaccharide (hexose) transporter)
51           —            994          —            (PTR2 POT family)
53           —            1067         1159         (cation transporter protein Ec)
55           —            1047         —            (amino acid selective channel protein)
57                                                  (sugar transporter)
59           —            1077         —            (transporter protein)
61           —            1085         —            Similarity[ab043024_34-1656 /codon_start = 1 /db_xref = “gi: 8051712” /product = “sodium sulfate or dicarboxylate transporter” /protein_id = “baa96091.1”] Evidence[100% (1510/1510)]
63           —            1105         —            Similar to gi|7523692|gb|AAF63131.1|AC011001_1 Putative chloroplast inner envelope protein [Arabidopsis thaliana]
65           —            957          1114         Similar to PITH_STRHA P41132 STREPTOMYCES HALSTEDII. PUTATIVE LOW-AFFINITY INORGANIC PHOSPHATE TRANSPORTER (FRAGMENT)
67           944          1075         —            Similar to PTR2_YEAST P32901 SACCHAROMYCES CEREVISIAE (BAKER'S YEAST). PEPTIDE TRANSPORTER PTR2 (PEPTIDE PERMEASE PTR2).

TABLE 7 Genes involved in rice grain filling, which belong to thefunctional category of carbohydrate metabolism STARCH METABOLISM RiceBanana Wheat Maize (SEQ ID (SEQ ID (SEQ ID (SEQ ID NO) NO) NO) NO) GeneDescription Branching Enzyme  69 888 1058 — Similar to GLGB_ORYSA Q01401ORYZA SATIVA (RICE). 1,4-ALPHA-GLUCAN BRANCHING ENZYME (EC 2.4.1.18)(STARCH BRANCHINGENZYME) (Q-ENZYME).  71 — 1026 1157 Similar togi|4584507|emb|CAB40745.1| starch branching enzyme II [Solanumtuberosum]  73 — 1018 1157 gi|3851526|gb|AAC72335.1| starch branchingenzyme IIa [Hordeum vulgare Debranching Enzyme  75 —  987 —gi|1783306|dbj|BAA09167.1| starch debranching enzyme precursor [Oryzasativa]  77 —  966 — Similar to gi|3252794|dbj|BAA29041.1| isoamylase[Oryza sativa] Starch degradation Alpha - Amylases  79 909 1083 1173Similar to AMYM_BACST P19531 BACILLUS STEAROTHERMOPHILUS. MALTOGENICALPHA-AMYLASE PRECURSOR (EC 3.2.1.133) (GLUCAN 1,4-ALPHA-MALTOHYDROLASE) 81 887 1035 1150 Similar to gi|426482| Alpha-amylase  83 887 1033 1150|CAA39777.1| Alpha- amylase  85 — 1033 1150 |CAA39777.1| Alpha- amylase 87 887 1033 1151 |PF00128| Alpha-amylase   89; 887 1032 1150gi|426482|aaa50161.1| Alpha-amylase 509  91 — 1034 1150gi|113766|sp|P17654|AMY1_ORYSA ALPHA-AMYLASE PRECURSOR(1,4-ALPHA-D-GLUCAN GLUCANOHYDROLASE) (ISOZYME 1B) alpha-AmylaseInhibitor  93  95 Motifs{Cereal_Tryp_Amyl_Inh Cereal trypsin/alpha-amylase inhibitors family; Pfam6_1|PF00234|tryp_alpha_amylProtease inhibitor/seed storage family} Evidence[100% (474/474)]  97Motifs{Aldehyde_Dehydr_Cys Aldehyde dehydrogenases active sites;Cereal_Tryp_Amyl_Inh Cereal trypsin/alpha-amylase inhibitors family}Evidence[99% (486/489)]  99 Motifs{Cereal_Tryp_Amyl_Inh Cereal trypsin/alpha-amylase inhibitors family; Pfam6_1|PF00234|tryp_alpha_amylProtease inhibitor/seed storage family} Evidence[100% (501/501)]Beta-Amylase 101 —  965 1107 Similarity[y16242_1-1798 /codon_start = 2/db_xref = “gi: 4138596” /partial = true /product = “beta-amylase”/protein_id = 
“caa76131.1”] Evidence[100% (931/931)]. 103 926  956 1156Similarity[z25871_48-1514 /codon_start = 1 /db_xref =“swiss-prot:p55005” /ec_number = “3.2.1.2” /product = “beta-amylase”/protein_id = “caa81091.1”] Evidence[100% (1539/1539)] 105 —  955 —gi|1703302|sp|P55005|AMYB_MAIZE BETA- AMYLASE (1,4-ALPHA-D-GLUCANMALTOHYDROLASE) 107 —  965 1106 gi|3334120|sp|P93594|AMYB_WHEAT BETA-AMYLASE (1,4-ALPHA-D-GLUCAN MALTOHYDROLASE) Pullulanase 109 —  987 —Similarity[ab012915_2206-14924 /codon_start = 1 /db_xref = “gi: 3172048”/product = “starch debranching enzyme” /protein_id = “baa28632.1” /note= “pullulanase”] Evidence[100% (3079/3079)] 887 1032 1150 Glucosidase111 — 1005 — Similar to AMYG_NEUCR P14804 NEUROSPORA CRASSA.GLUCOAMYLASE PRECURSOR (EC 3.2.1.3) (GLUCAN 1,4-ALPHA-GLUCOSIDASE)-(1,4-ALPHA-D-GLUCAN GLUCOHYDROLASE). 113 905 1021 —|CAA04707.1| Alpha-glucosidase 115 — 1086 1144gi|3023275|sp|Q43763|AGLU_HORVU ALPHA- GLUCOSIDASE PRECURSOR (MALTASE)117 gi|544151|sp|Q99040|DEXB_STRMU GLUCAN 1,6-ALPHA-GLUCOSIDASE (DEXTRANGLUCOSIDASE) (EXO-1,6-ALPHA- GLUCOSIDASE) (GLUCODEXTRANASE) SuroseSynthase 119 932 1006 1148 Similar to SUS2_ARATH Q00917 ARABIDOPSISTHALIANA (MOUSE-EAR CRESS). SUCROSE SYNTHASE (EC 2.4.1.13) (SUCROSE-UDPGLUCOSYLTRANSFERASE). 121 930 1022 1170 gi|283009|pir∥S22535 sucrosesynthase (EC 2.4.1.13) 1 - rice (fragment) 123 930 1028 1170gi|20366|emb|CAA46017.1| sucrose synthase [Oryza sativa] 125 930 10541170 gi|267055|sp|Q00917|SUS2_ARATH SUCROSE SYNTHASE (SUCROSE-UDPGLUCOSYLTRANSFERASE) 127 930 1054 1191 gi|66572|pir∥YUMU sucrosesynthase (EC 2.4.1.13) - Arabidopsis thaliana Starch Synthase 129 — 1066— Similar to UGS3_SOLTU Q43847 SOLANUM TUBEROSUM (POTATO). GLYCOGEN(STARCH) SYNTHASE PRECURSOR (EC 2.4.1.11) (GBSSII) (GRANULE-BOUND STARCHSYNTHASE II) (FRAGMENT) 131 924 1070 1125 Similar togi|3057122|gb|AAC14015.1| starch synthase DULL1 [Zea mays 133 947 10551155 Similar to gi|5257102|gb|AAD41242.1| granule bound starch synthase[Oryza sativa subsp. 
japonica] ADPG pyrophosphorylase 135 —  989 1193Similar to gi|3093462|gb|AAC15247.1| ADP-glucose pyrophosphorylase largesubunit [Oryza sativa] 137 922 1098 — Similarity[ay028315_115-1617/codon_start = 1 /db_xref = “gi: 13508485” /product = “adp-glucosepyrophosphorylase small subunit” /protein_id = “aak27313.1” /note =“putative amyloplast form”] Evidence[100% (1520/1520)] 139 —  989 1193Similarity[ac007858_66917-70303 /codon_start = 1 /db_xref = “gi:5091608” /evidence = “not_experimental” /gene = “10a19i.12” /protein_id= “aad39597.1” /note = “identical to gb|d50317 adp glucosepyrophosphorylase large subunit from oryza sativa. ests dbj|d22125 anddbj|d15718 come from”] Evidence[100% (1615/1615)] Gene[10A19I.12Identical to gb|D50317 ADP glucose pyrophosphorylase large subunit fromOryza sativa. ESTs dbj|D22125 and dbj|D15718 come from] 141 922 10981193 Similar to gi|169759|gb|AAA33890.1| ADP- glucose pyrophosphorylase51 kD subunit (EC 2.7.7.27) Triosephosphate Isomerase 143 912 1046 1133Similarity[z32521_64-960 /codon_start = 1 /db_xref = “swiss-prot:p46225” /ec_number = “5.3.1.1” /product = “triosephosphate isomerase”/protein_id = “caa83533.1”] Evidence[100% (822/822)] 145 912 1046 1133db_xref = “swiss-prot: p46225” /ec_number = “5.3.1.1” /product =“triosephosphate isomerase” /protein_id = “caa83533.1”] Evidence[100%(822/822)] 147 890 1003 1134 Similarity[j04121_1-762 /codon_start = 1/db_xref = “gi: 556171” /product = “triosephosphate isomerase”/protein_id = “aab62730.1”] Evidence[100% (683/683)] Other proteinsinvolved in starch metabolism 149 936 1043 1194Similarity[x53130_51-1127 /codon_start = 1 /db_xref = “swiss-prot:p17784” /protein_id = “caa37290.1” /note = “fructose-diphosphatealdolase (aa 1-358)”] Evidence[100% (1078/1078)] 151 —  963 1124|AAA45939.1|Alpha-1,4-glucan phosphorylase h isozyme 153 950  959 1164Similarity[af020813_273-1436 /codon_start = 1 /db_xref = “gi: 2997589”/function = “mediates the antiport of glucose-6-phosphate againstphosphate in plastids 
of heterotrophic tissues” /gene = “gpt” /product =“glucose-6-phosphate/phosphate-translocator precursor” /protein_id =“aac08524.1”  155; 913 — 1154 gi|4539316|emb|CAB38817.1| putativefructose- 507 bisphosphate aldolase [Arabidopsis thaliana] 157 — 1069 —Motifs{Pfam6_1|PF00702|Hydrolase haloacid dehalogenase-like hydrolase}Evidence[82% (1032/1254)] 159 — 1023 — Similarity[u17225_40-1743/codon_start = 1 /db_xref = “gi: 596023” /ec_number = “5.3.1.9” /gene =“phil” /product = “glucose-6 phosphate isomerase” /protein_id =“aaa82734.1” /note = “phosphohexose isomerase”] Evidence[100%(1889/1889)] Gene[phil 5.3.1.9 glucose-6 phosphate isomerasephosphohexose isomerase] 161 946 1103 1189 Similarity[ab013353_89-1504/codon_start = 1 /db_xref = “gi: 3107931” /product = “udp-glucosepyrophosphorylase” /protein_id = “baa25917.1”] Evidence[100%(1582/1582)] 163 937  970 1153 Similarity[af372833_47-1273 /codon_start= 1 /db_xref = “gi: 13991929” /product = “phos-phoenolpyruvate/phosphate translocator” /protein_id = “aak51561.1” /note= “ppt”] Evidence[100% (1239/1239)] 165 892  964 1179 Similar togi|5231119|gb|AAD41079.1|AF143202_1 starch phosphorylase L [Solariumtuberosum]; gi|130172|sp|P27598|PHSL_IPOBA ALPHA-1,4 GLUCANPHOSPHORYLASE, L ISOZYME, CHLOROPLAST PRECURSOR (STARCH PHOSPHORYLASE L)167 902  997 — Motifs{Pfam6_1|PF01591|6PF2K 6-phosphofructo- 2-kinase;Atp_Gtp_A ATP/GTP-binding site motif A (P-loop)} Evidence[71%(2205/3069)] 169 946 1050 — Similarity[ap001383_68171-73040 /codon_start= 1 /db_xref = “gi: 7242911” /protein_id = “baa92509.1” /note = “similarto udp-glucose pyrophosphorylase. 
(x91347)”] Evidence[100% (1528/1528)171 — 1023 — Similarity[u17225_40-1743 /codon_start = 1 /db_xref = “gi:596023” /ec_number = “5.3.1.9” /gene = “phil” /product = “glucose-6phosphate isomerase” /protein_id = “aaa82734.1” /note = “phosphohexoseisomerase”] Evidence[100% (1889/1889)] Gene[phil 5.3.1.9 glucose-6phosphate isomerase phosphohexose isomerase] 173 —  975 —Similarity[d45218_54-1760 /codon_start = 1 /db_xref = “gi: 639686”/product = “phosphoglucose isomerase (pgi-b)” /protein_id =“baa08149.1”] Evidence[100% (1409/1409)] 175 937  970 1153Similarity[af372833_47-1273 /codon_start = 1 /db_xref = “gi: 13991929”/product = “phosphoenolpyruvate/phosphate translocator” /protein_id =“aak51561.1” /note = “ppt”] Evidence[100% (1050/1050)] 177 889 1081 1196Motifs{Pfam6_1|PF00274|glycolytic_enzy Fructose-bisphosphate aldolaseclass-I; Aldolase_Class_I Fructose-bisphosphate aldolase class-I activesite} Evidence[65% (1082/1650)] 179 —  977 1180Similarity[z32850_352-4957 /codon_start = 1 /db_xref = “swiss-prot:q41141” /product = “pyrophosphate-dependent phosphofructokinasebetasubunit” /protein_id = “caa83683.1”] Evidence[100% (1698/1698)] 181892  964 1179 Similarity[af095521_76-1923 /codon_start = 1 /db_xref =“gi: 3790102” /ec_number = “2.7.1.90” /gene = “ppi-pfka” /product =“pyrophosphate- dependent phosphofructokinase alpha subunit” /protein_id= “aac67587.1”] Evidence[100% (1939/1939)] Gene[PPi-PFKa 2.7.1.90pyrophosphate-dependent phosphofructo- kinase alpha subunit] 183 906 988 1113 gi|3122594|sp|Q59126|PFP_AMYME PYROPHOSPHATE--FRUCTOSE6-PHOSPHATE 1-PHOSPHOTRANSFERASE (6-PHOSPHOFRUCTOKINASE (PYROPHOSPHATE))(PYROPHOSPHATE-DEPENDENT 6- PHOSPHOFRUCTOSE-1-KINASE) (PPI-PFK) 185 8961014 1180 gi|2499488|sp|Q41140|PFPA_RICCO PYROPHOSPHATE--FRUCTOSE6-PHOSPHATE 1-PHOSPHOTRANSFERASE ALPHA SUBUNIT (PFP)(6-PHOSPHOFRUCTOKINASE (PYROPHOSPHATE)) (PYROPHOSPHATE-DEPENDENT6-PHOSPHOFRUCTOSE-1- KINASE) (PPI-PFK)  187; 911 — 1138gi|3913641|sp|O64422|F16P_ORYSA 511 
FRUCTOSE-1,6-BISPHOSPHATASE,CHLOROPLAST PRECURSOR (D-FRUCTOSE-1,6-BISPHOSPHATE 1-PHOSPHOHYDROLASE)(FBPASE) Non-Starch Carbohydrate Metabolism 189 912 1046 1133Similarity[z3252l_64-960 /codon_start = 1 /db_xref = “swiss-prot:p46225” /ec_number = “5.3.1.1” /product = “triosephosphate isomerase”/protein_id = “caa83533.1”] Evidence[100% (822/822)]  191; — 1052 1121Similar to gi|9294516|dbj|BAB02778.1| contains 503 similarity toendo-1,3-1,4-beta-D-glucanase˜gene_id: MDB19.8 [Arabidopsis thaliana]193 Similar to PTSN_ECOLI P31222 ESCHERICHIA COLI. NITROGEN REGULATORYIIA PROTEIN (EC 2.7.1.69) (ENZYME IIA- NTR)(PHOSPHOTRANSFERASE ENZYMEII, A COMPONENT); Motifs{Cytochrome_C Cytochrome c family heme-bindingsite; Zinc_Finger_C2h2_1 Zinc finger, C2H2 type, domain;Zinc_Finger_C2h2_1 Zinc finger, C2H2 type, domain; Zinc_Finger_C2h2_1Zinc finger, C2H2 type, domain) Evidence[0% (0/2145)] 195 — 1041 1137Similar to gi|6714431|gb|AAF26119.1|ACO12328_22 putative cellulosesynthase catalytic subunit [Arabidopsis thaliana] 197 Similar togi|22327|emb|CAA37998.1| corn Hageman factor inhibitor [Zea mays] 199 —1096 — gi|728850|sp|P08640|AMYH_YEAST GLUCOAMYLASE S1/S2 PRECURSOR(GLUCAN 201 Elements[GC_box@16653 TATA_box@16019 ATG@15968 PolyA@10370]Evidence[88% (2550/2886) 203 — 1020 1140 Similar togi|3850573|gb|AAC72113.1| Similar to gi|1652733 glycogen operon proteinGlgX from Synechocystis sp. genome gb|D90908.ESTs gb|H36690,gb|AA712462, gb|AA651230 and gb|N95932 come from this gene. [Arabidopsisthaliana] 205 904 1095 1130 Similar to gi|5441877|dbj|BAA82375.1|Similar to glycogenin glucosyltransferase (EC 2.4.1.186). (Z97341)[Oryza sativa] 207 895 1076 1181 Similar togi|8777412|dbj|BAA97002.1|indole-3- glycerol phosphate synthase[Arabidopsis thaliana] 209 — 1101 — gi|14156|sp|P13526|ARLC_MAIZEANTHOCYANIN REGULATORY LC PROTEIN

TABLE 8 Genes involved in rice grain filling, which belong to thefunctional category of storage proteins Rice Banana Wheat Maize (SEQ(SEQ (SEQ (SEQ ID ID ID ID NO) NO) NO) NO) Gene Description 211gi|121099|sp| P08079|GDB0_WHEAT GAMMA-GLIADIN PRECURSOR 213 — 1044 1165Similar to GL19_ORYSA P29835 ORYZA SATIVA (RICE). 19 KD GLOBULINPRECURSOR (ALPHA- GLOBULIN). 215 Similar to gi|224389|prf∥ 1103218Aglycinin A5 [Glycine max] 217 Similar to gi|296129|emb| CAA46197.1|prolamin [Oryza sativa] 219 Similar to gi|7209261|emb| CAB76962.1|alpha-gliadin [Triticum aestivum] Similar to gi|4126695|dbj| BAA36699.1|prolamin [Oryza sativa] 221 Similar to METC_RHILV Q52811 RHIZOBIUMLEGUMINOSARUM (BIOVAR VICIAE). PUTATIVE CYSTATHIONINE BETA-LYASE (EC4.4.1.8) (CBL) (BETA-CYSTATHIONASE) (CYSTEINE LYASE) (ORF5) (FRAGMENT).223 — 960 — Similar to GU11_(—) ORYSA P07728 ORYZA SATIVA (RICE).GLUTELIN TYPE I PRECURSOR (CLONE PREE 61). 225 — 1068 — Similar togi|226227|prf∥ 1502200A prolamin [Avena sativa] 227 — 1044 1165gi|232161|sp| P29835|GL19_ORYSA 19 KD GLOBULIN PRECURSOR 229 — 960 —Similar to gi|169969|gb| AAA33964.1| glycinin 231 948 953 1176 Similarto PRVA_RANCA P18087 RANA CATESBEIANA (BULL FROG). PARVALBUMIN ALPHA (PA4.97). 233 — 991 — gi|121101|sp| P08453|GDB2_WHEAT GAMMA-GLIADINPRECURSOR 235 — 960 — Similar to gi|20227|emb|CAA32566.1|preprolglutelin (AA −24 to 476) [Oryza sativa] 237 — 1073 1190 Similarto PRVT_CHICK P19753 GALLUS GALLUS (CHICKEN). PARVALBUMIN, THYMIC (AVIANTHYMIC HORMONE) (ATH) (THYMUS- SPECIFICANTIGEN T1). 239 Similar togi|20208|emb| CAA38211.1| glutelin [Oryza sativa] 241 Similar togi|556407|gb| AAA50319.1| prolamin 243 Similar to gi|166555|gb|AAA32715.1| avenin 245 — 1048 — gi|1170517|sp| P45386|IGA4_HAEINIMMUNOGLOBULIN A1 PROTEASE PRECURSOR 247 gi|121090|sp| P04721|GDA1_WHEATALPHA/BETA-GLIADIN A-I PRECURSOR 249 gi|121101|sp| P08453|GDB2_WHEATGAMMA-GLIADIN PRECURSOR

TABLE 9 Genes involved in rice grain filling, which belong to thefunctional category of Fatty Acid Metabolism Rice Banana Wheat Maize(SEQ (SEQ (SEQ (SEQ ID ID ID ID NO) NO) NO) NO) Gene Description 251 920976 1131 Similar to PHLB_SERLI P18954 SERRATIA LIQUEFACIENS. PHLBPROTEIN PRECURSOR. 253 — 995 — Similar to LPXK_FRANO Q47909 FRANCISELLANOVICIDA. PROBABLE TETRAACYLDISACCHARIDE 4 -KINASE (EC 2.7.1.130) (LIPIDA 4 -KINASE). 255 — 972 1126 Similar to gi|7339489|emb| CAB82812.1|phospho- lipase-like protein [Arabidopsis thaliana] 257 — 1087 1177Similar to OLE2_ORYSA Q40646 ORYZA SATIVA (RICE). OLEOSIN 18 KD(OSE721). Similar to gi|1171354| gb|AAC02240.1| 18 kDa oleosin [Oryzasativa] 259 — 1100 1132 Similar to gi|4455257|emb| CAB36756.1| oleosin,18.5 K [Arabidopsis thaliana] 261 910 1093 1158 Similar to KSU5_ECOLIP42216 ESCHERICHIA COLI. 3-DEOXY-MANNO- OCTULOSONATE CYTIDYLYLTRANS-FERASE (EC 2.7.7.38) (CMP-KDOSYNTHETASE) (CMP-2-KETO-3- DEOXYOCTULOSONICACID SYNTHETASE) (CKS). 263 884 1038 1172 Similar to ACBP_GOSHI Q39779GOSSYPIUM HIRSUTUM (UPLAND COTTON). ACYL-COA-BINDING PROTEIN (ACBP). 265915 990 1122 Similar to gi|4587543|gb| AAD25774.1| AC006577_10 Belongsto the PF|00657 Lipase/Acylhydrolase with GDSL-motif family.ESTgb|AB015099 comes from this gene. [Arabidopsis thaliana] 267 897 10821195 Similar to GBSB_BACSU P71017 BACILLUS SUBTILIS. ALCOHOLDEHYDROGENASE (EC 1.1.1.1). 269 — 961 — Similar to gi|6714447|gb|AAF26134.1| AC011620_10 putative phospholipase D [Arabidopsis thaliana]271 — 1100 1132 Similar to gi|1171352|gb| AAC02239.1| 16 kDa oleosin[Oryza sativa] Similar to gi|944830|emb| CAA43183.1| soybean 24 kDaoleosin isoform [Glycine max] 273 886 1012 1178 Similar togi|7576210|emb| CAB87871.1| palmitoyl- protein thioesteraseprecursor-like [Arabidopsis thaliana] 275 Similar to 3O1D_COMTE Q06401COMAMONAS TESTOSTERONI (PSEUDOMONAS TESTOSTERONI). 3-OXOSTEROID 1-DEHYDROGENASE (EC 1.3.99.4). 277 — 951 1160 Similar to CRTI_PHYBL P54982PHYCOMYCES BLAKESLEEANUS. 
PHYTOENE DEHYDROGENASE (EC 1.3.-.-) (PHYTOENEDESATURASE). 279 — 973 — Similar to gi|6648208|gb|AAF21206.1|AC013483_30 putative phosphatidylinositol- 4-phosphate5-kinase [Arabidopsis thaliana]

TABLE 10 Genes involved in rice grain filling, which belong to the functional category of amino acid metabolism

Rice         Banana       Wheat        Maize
(SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  Gene Description
281          —            1053         —            Similar to gi|2076884|gb|AAB539751| lysine-ketoglutarate reductase/saccharopine dehydrogenase [Arabidopsis thaliana]
283          —            1036         1199         Similar to gi|974605|gb|AAA75104.1| single-stranded nucleic acid binding protein
285          —            978          —            68173.m01963#MAL21_29#AT3g20250#RNA-binding protein, putative- Length = 955
287          918          1008         1139         gi|730108|sp|Q00539|NAM8_YEAST NAM8 PROTEIN
289          928          1061         —            Similar to gi|287298|dbj|BAA03504.1| aspartate aminotransferase [Oryza sativa]
291          923          980          1141         Similar to MTAP_HUMAN Q13126 HOMO SAPIENS (HUMAN). 5-METHYLTHIOADENOSINE PHOSPHORYLASE (EC 2.4.2.28) (MTA PHOSPHORYLASE) (MTAPASE).
293                                                 Similar to SEPR_THESP P80146 THERMUS SP. (STRAIN RT41A). EXTRACELLULAR SERINE PROTEINASE PRECURSOR (EC 3.4.21.-).
295          903          1019         —            Similar to gi|6728985|gb|AAF26983.1|AC018363_28 putative S-adenosylmethionine:2-demethylmenaquinone methyltransferase [A thaliana]
297          —            1092         —            68173.m01963#MAL21_29#AT3g20250#RNA-binding protein, putative- Length = 955
299          —            986          1169         Similar to IF4H_HUMAN Q15056 HOMO SAPIENS (HUMAN). EUKARYOTIC TRANSLATION INITIATION FACTOR 4H (EIF-4H) (KIAA0038).

TABLE 11 Genes involved in rice grain filling, which belong to the functional category of transcription factors

Rice         Banana       Wheat        Maize
(SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  (SEQ ID NO)  Gene Description
301                                                 Similar to gi|7211973|gb|AAF40444.1|AC004809_2 Contains similarity to the CREB-binding protein (CBP) from Mus sp gb|S66385. [Arabidopsis thaliana]
303          —            974          —            Similar to gi|6899934|emb|CAB71884.1| putative zinc-finger protein [A thaliana]
305                                                 gi|2493550|sp|Q02516|HAP5_YEAST TRANSCRIPTIONAL ACTIVATOR HAP5
307          898          1091         1201         Similar to gi|403418|gb|AAA18414.1| GBF4
309                                                 68170.m04237#F14G24_15#At1g52880#NAM-like protein Length = 320
311          933          996          1129         Myb family transcription factor
313                                                 Myb family transcription factor
315          943          1072         1119         Myb family transcription factor
317          —            1007         —            Myb family transcription factor
319          —            1013         1143         Similarity[af007269_37269-38693 /gene = “a_ig002n01.20” /protein_id = “aab61027.1” /note = “contains weak similarity to myb-related proteins”] Evidence[100% (559/559)]
321          —            1097         1135         Motifs{Myb_2 Myb DNA-binding domain repeat; Myb_2 Myb DNA-binding domain repeat} Evidence[38% (306/804)]
323          940          981          1197         Similar to gi|2894607|emb|CAA17141.1| NAM (no apical meristem)-like protein [Arabidopsis thaliana]
325          —            —            1171         Similar to gi|2224929|gb|AAC49747.1| ethylene-insensitive3-like2 [Arabidopsis thaliana]
327          —            979          1174         Myb DNA-binding domain repeat; Myb_2 Myb DNA-binding domain repeat; Myb_2 Myb DNA-binding domain repeat} Evidence[69% (615/879)]

Example 5 Rice Orthologs of Arabidopsis Grain Filling Genes Identified by Reverse Genetics

Understanding the function of every gene is the major challenge in the age of completely sequenced eukaryotic genomes. Sequence homology can be helpful in identifying possible functions of many genes. Reverse genetics, the process of identifying the function of a gene by obtaining and studying the phenotype of an individual containing a mutation in that gene, is a complementary approach.

Reverse genetics in Arabidopsis has been aided by the establishment of large publicly available collections of insertion mutants (Krysan et al., (1999) Plant Cell 11, 2283-2290; Tissier et al., (1999) Plant Cell 11, 1841-1852; Speulman et al., (1999) Plant Cell 11, 1853-1866; Parinov et al., (1999) Plant Cell 11, 2263-2270; Parinov and Sundaresan, (2000) Curr. Opin. Biotechnol. 11, 157-161). Mutations in genes of interest are identified by PCR screening of large pools of individuals, using primers derived from sequences near the insert border and from the gene of interest. Pools producing PCR products are confirmed by Southern hybridization and further deconvoluted into subpools until the individual is identified (Sussman et al., (2000) Plant Physiology 124, 1465-1467).
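
The pool-and-deconvolute screen described above can be sketched in code. The following is a minimal illustration only, not the procedure of the cited references: the function `deconvolute`, the subpool count, the line names and the simulated assay are all hypothetical, and the Southern hybridization confirmation step is omitted. The callable `pcr_positive` stands in for the wet-lab PCR on a pooled DNA sample.

```python
from typing import Callable, List, Sequence


def deconvolute(
    pool: Sequence[str],
    pcr_positive: Callable[[Sequence[str]], bool],
    n_subpools: int = 4,
) -> List[str]:
    """Return the individual lines in `pool` that carry the insertion."""
    if n_subpools < 2:
        raise ValueError("n_subpools must be at least 2")
    if not pool or not pcr_positive(pool):
        return []  # negative pool: none of its members needs further testing
    if len(pool) == 1:
        return list(pool)  # a positive pool of one is the individual sought
    size = -(-len(pool) // n_subpools)  # ceiling division
    hits: List[str] = []
    for start in range(0, len(pool), size):
        subpool = pool[start:start + size]
        hits.extend(deconvolute(subpool, pcr_positive, n_subpools))
    return hits


# Hypothetical population of 100 lines, one of which carries the insertion.
lines = [f"line_{i:04d}" for i in range(100)]
carriers = {"line_0042"}
assay_log: List[int] = []  # records the size of every pool that was assayed


def simulated_pcr(sample: Sequence[str]) -> bool:
    """Simulated assay: positive if any member of the pool is a carrier."""
    assay_log.append(len(sample))
    return any(line in carriers for line in sample)


found = deconvolute(lines, simulated_pcr)
print(found, len(assay_log))
```

In this toy run the carrier is located with far fewer assays than testing all 100 lines one by one, which is the motivation for pooling; the saving shrinks as the number of positive individuals in a pool grows.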

Recently, some groups have begun the process of sequencing insertion site flanking regions from individual plants in large insertion mutant populations, in effect prescreening a subset of lines for genomic insertion sites (Parinov et al., (1999) Plant Cell 11, 2263-2270; Tissier et al., (1999) Plant Cell 11, 1841-1852). The advantage of this approach is that the laborious and time-consuming process of PCR-based screening and deconvolution of pools is avoided.

A large database of insertion site flanking sequences from approximately 100,000 T-DNA mutagenized Arabidopsis plants of the Columbia ecotype (GARLIC lines) is prepared. T-DNA left border sequences from individual plants are amplified using a modified thermal asymmetric interlaced-polymerase chain reaction (TAIL-PCR) protocol (Liu et al., (1995) Plant J. 8, 457-463). Left border TAIL-PCR products are sequenced and assembled into a database that associates sequence tags with each of the approximately 100,000 plants in the mutant collection. Screening the collection for insertions in genes of interest involves a simple gene name or sequence BLAST query of the insertion site flanking sequence database, and search results point to individual lines. Insertions are confirmed using PCR.
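
A gene-name query against such a database amounts to an index lookup from gene to plant lines. The sketch below illustrates that idea under stated assumptions: the record layout `FlankTag`, the helper names and every identifier in the example data are hypothetical, and the sequence-based (BLAST) query route is not modelled.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional


class FlankTag(NamedTuple):
    line_id: str  # plant line in the mutant collection
    gene: str     # gene to which the flanking sequence tag maps
    region: str   # "promoter" or "coding"


def build_index(tags: Iterable[FlankTag]) -> Dict[str, List[FlankTag]]:
    """Group flanking sequence tags by the gene they map to."""
    index: Dict[str, List[FlankTag]] = defaultdict(list)
    for tag in tags:
        index[tag.gene].append(tag)
    return index


def lines_for_gene(
    index: Dict[str, List[FlankTag]],
    gene: str,
    region: Optional[str] = None,
) -> List[str]:
    """Return the lines with an insertion in `gene`, optionally by region."""
    return sorted(
        {t.line_id for t in index.get(gene, []) if region in (None, t.region)}
    )


# Hypothetical records; the gene and line identifiers are placeholders.
tags = [
    FlankTag("line_0007", "gene_A", "promoter"),
    FlankTag("line_0311", "gene_A", "coding"),
    FlankTag("line_0412", "gene_B", "coding"),
]
index = build_index(tags)
print(lines_for_gene(index, "gene_A"))
print(lines_for_gene(index, "gene_A", region="coding"))
```

Each line returned by such a query would still need the PCR confirmation mentioned above before being used.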

Analysis of the GARLIC insert lines suggests that there are 76,856 insertions that localize to a subset of the genome representing coding regions and promoters of 22,880 genes. Of these, 49,231 insertions lie in the promoters of over 18,572 genes, and an additional 27,625 insertions are located within the coding regions of 13,612 genes. Approximately 25,000 T-DNA left border mTAIL-PCR products (25% of the total 102,765) do not have significant matches to the subset of the genome representing promoters and coding regions, and are therefore presumed to lie in noncoding and/or repetitive regions of the genome.
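The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers from the text; the variable names are illustrative, and the "minimum genes hit in both regions" value is an inference from the quoted counts (treating "over 18,572" as a lower bound), not a figure reported in the text.

```python
# Figures quoted in the text for the GARLIC insertion collection.
promoter_insertions = 49_231
coding_insertions = 27_625
genic_insertions = 76_856
promoter_genes = 18_572       # text says "over 18,572"; used here as a lower bound
coding_genes = 13_612
genes_hit = 22_880
unmatched_products = 25_000   # "approximately 25,000"
total_products = 102_765

# Promoter and coding-region insertions should add up to the genic total.
insertions_consistent = promoter_insertions + coding_insertions == genic_insertions

# Promoter-hit and coding-hit gene counts sum to more than the number of
# distinct genes, so at least this many genes carry insertions in both regions.
min_genes_hit_in_both = promoter_genes + coding_genes - genes_hit

# Share of left border products with no match to promoters or coding regions.
unmatched_fraction = unmatched_products / total_products

print(f"insertion counts consistent: {insertions_consistent}")
print(f"genes hit in both regions (minimum): {min_genes_hit_in_both}")
print(f"unmatched fraction: {unmatched_fraction:.1%}")
```

The unmatched share comes out slightly under one quarter, which agrees with the rounded "25%" given in the text.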

The Arabidopsis T-DNA GARLIC insertion collection is used to investigate the roles of certain genes in the grain filling process. Target genes are chosen using a variety of criteria, including public reports of mutant phenotypes, RNA profiling experiments, and sequence similarity to genes implicated in grain filling. Plant lines with insertions in genes of interest are then identified. Each T-DNA insertion line is represented by a seed lot collected from a plant that is hemizygous for a particular T-DNA insertion. Plants homozygous for insertions of interest are identified using a PCR assay. The seed produced by these plants is homozygous for the T-DNA insertion mutation of interest.
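The text does not specify the design of the PCR assay. As a minimal sketch, assume the commonly used two-reaction layout: one reaction with gene-specific primers spanning the insertion site (a band indicates a wild-type allele) and one with a T-DNA border primer plus a gene-specific primer (a band indicates an insertion allele). The function name, plant labels, and band calls below are hypothetical.

```python
def call_zygosity(wild_type_band: bool, insertion_band: bool) -> str:
    """Classify a plant from presence/absence of the two diagnostic PCR bands.

    Assumed assay layout (not stated in the source text):
      wild_type_band - product from gene-specific primers flanking the insertion site
      insertion_band - product from a T-DNA border primer plus a gene-specific primer
    """
    if insertion_band and not wild_type_band:
        return "homozygous insertion"
    if insertion_band and wild_type_band:
        return "hemizygous"
    if wild_type_band:
        return "wild type"
    return "no call"  # neither reaction worked; repeat the PCR


# Hypothetical band calls for progeny grown from one hemizygous seed lot.
progeny = {
    "plant_01": (True, True),
    "plant_02": (False, True),
    "plant_03": (True, False),
    "plant_04": (False, False),
}

homozygous_plants = [
    name for name, bands in progeny.items()
    if call_zygosity(*bands) == "homozygous insertion"
]
print(homozygous_plants)
```

Only plants scored as homozygous would be carried forward for seed collection and phenotyping.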

Homozygous mutant plants are tested for altered grain composition. The genes interrupted in these mutants contribute to the observed phenotype; their disruption interferes with the normal grain filling process.

Rice orthologs of the Arabidopsis genes affecting the grain filling process and thus grain composition are identified by similarity searching of a rice database using the Double-Affine Smith-Waterman algorithm (BLASP with e values better than 10⁻¹⁰).

Example 6 Cloning and Sequencing of Nucleic Acid Molecules from Rice

6.1 Genomic DNA:

Plant genomic DNA samples are isolated from a collection of tissues which are listed in Table 1. Individual tissues are collected from a minimum of five plants and pooled. DNA can be isolated according to one of three procedures: standard procedures described by Ausubel et al. (1995), a quick leaf prep described by Klimyuk et al. (1993), or FTA paper (Life Technologies).

For the latter procedure, a piece of plant tissue such as, for example, leaf tissue is excised from the plant, placed on top of the FTA paper and covered with a small piece of parafilm that serves as a barrier material to prevent contamination of the crushing device. In order to drive the sap and cells from the plant tissue into the FTA paper matrix for effective cell lysis and nucleic acid entrapment, a crushing device is used to mash the tissue into the FTA paper. The FTA paper is air dried for an hour. For analysis of DNA, the samples can be archived on the paper until analysis. Two mm punches are removed from the specimen area on the FTA paper using a 2 mm Harris Micro Punch™ and placed into PCR tubes. Two hundred (200) microliters of FTA purification reagent is added to the tube containing the punch and vortexed at low speed for 2 seconds. The tube is then incubated at room temperature for 5 minutes. The solution is removed with a pipette and the wash is repeated one more time. Two hundred (200) microliters of TE (10 mM Tris, 0.1 mM EDTA, pH 8.0) is added and the wash is repeated two more times. The PCR mix is added directly to the punch for subsequent PCR reactions.

6.2 Cloning of Candidate cDNA: A candidate cDNA is amplified from total RNA isolated from rice tissue after reverse transcription using primers designed against the computationally predicted cDNA. Primers designed based on the genomic sequence can be used to PCR amplify the full length cDNA (start to stop codon) from first strand cDNA prepared from rice cultivar Nipponbare tissue.

The Qiagen RNeasy kit (Qiagen, Hilden, Germany) is used for extraction of total RNA. The Superscript II kit (Invitrogen, Carlsbad, USA) is used for the reverse transcription reaction. PCR amplification of the candidate cDNA is carried out using the reverse primer sequence located at the translation start of the candidate gene in 5′-3′ direction. This is performed with high-fidelity Taq polymerase (Invitrogen, Carlsbad, USA).

The PCR fragment is then cloned into pCR2.1-TOPO (Invitrogen) or the pGEM-T easy vector (Promega Corporation, Madison, Wis., USA) per the manufacturer's instructions, and several individual clones are subjected to sequencing analysis.

6.3 DNA sequencing: DNA from 2-4 independent clones is miniprepped following the manufacturer's instructions (Qiagen). DNA is subjected to sequencing analysis using the BigDye™ Terminator Kit according to the manufacturer's instructions (AB). Sequencing makes use of primers designed to both strands of the predicted gene of interest. DNA sequencing is performed using standard dye-terminator sequencing procedures and automated sequencers (models 373 and 377; Applied Biosystems, Foster City, Calif.). All sequencing data are analyzed and assembled using the Phred/Phrap/Consed software package (University of Washington) to an error ratio equal to or less than 10⁻⁴ at the consensus sequence level.

The consensus sequence from the sequencing analysis is then validated as being intact and the correct gene in several ways. The coding region is checked for being full length (predicted start and stop codons present) and uninterrupted (no internal stop codons). Alignment with the gene prediction and BLAST analysis are used to ascertain that this is in fact the right gene.
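The full-length and uninterrupted checks described above lend themselves to a simple script. The following is a minimal sketch, assuming the consensus is supplied as a plain coding-strand string running from the predicted start codon to the stop codon; the function name and the returned keys are illustrative, not part of the described procedure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_coding_region(seq: str) -> dict:
    """Check that a coding sequence is full length and uninterrupted."""
    seq = seq.upper().replace("U", "T")
    in_frame = len(seq) % 3 == 0
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    has_start = bool(codons) and codons[0] == "ATG"
    has_stop = in_frame and bool(codons) and codons[-1] in STOP_CODONS
    # Positions (0-based codon index) of stop codons before the final codon.
    internal_stops = [i for i, c in enumerate(codons[:-1]) if c in STOP_CODONS]
    full_length = has_start and has_stop
    uninterrupted = not internal_stops
    return {
        "in_frame": in_frame,
        "full_length": full_length,
        "uninterrupted": uninterrupted,
        "internal_stops": internal_stops,
        "valid": in_frame and full_length and uninterrupted,
    }


print(check_coding_region("ATGGCCTAA"))      # toy example: start, one codon, stop
print(check_coding_region("ATGTAAGCCTGA"))   # toy example with a premature stop
```

A clone that fails either check would be set aside in favour of another independent clone; the comparison against the gene prediction and the BLAST step are not covered by this sketch.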

The clones are sequenced to verify their correct amplification.

Example 7 Functional Analysis in Plants

A plant complementation assay can be used for the functional characterization of the grain filling genes according to the invention.

Rice and Arabidopsis putative orthologue pairs are identified using BLAST comparisons, TFASTXY comparisons, and Double-Affine Smith-Waterman similarity searches. Constructs containing a rice cDNA or genomic clone inserted between the promoter and terminator of the Arabidopsis orthologue are generated using overlap PCR (Gene 77, 61-68 (1989)) and GATEWAY cloning (Life Technologies Invitrogen). For ease of cloning, rice cDNA clones are preferred to rice genomic clones. A three stage PCR strategy is used to make these constructs.

(1) In the first stage, primers are used to PCR amplify: (i) 2 kb upstream of the translation start site of the Arabidopsis orthologue, (ii) the coding region or cDNA of the rice orthologue, and (iii) the 500 bp immediately downstream of the Arabidopsis orthologue's translation stop site. Primers are designed to incorporate onto their 5′ ends at least 16 bases of the 3′ end of the adjacent fragment, except in the case of the most distal primers which flank the gene construct (the forward primer of the promoter and the reverse primer of the terminator). The forward primers of the promoters contain on their 5′ ends partial AttB1 sites, and the reverse primers of the terminators contain on their 5′ ends partial AttB2 sites, for Gateway cloning.

(2) In the second stage, overlap PCR is used to join either the promoter and the coding region, or the coding region and the terminator.

(3) In the third stage, either the promoter-coding region product can be joined to the terminator or the coding region-terminator product can be joined to the promoter, using overlap PCR and amplification with full Att site-containing primers, to link all three fragments and put full Att sites at the construct termini.
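The primer layout of the first stage can be sketched as follows. This is an illustrative outline only: the 20-base annealing length, the function names, and the toy sequences are assumptions, and the partial AttB1/AttB2 tails on the two most distal primers are deliberately left out because their sequences are not given in the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_primers(fragments, overlap=16, anneal=20):
    """Return (forward, reverse) primers for each fragment, in 5'->3' order.

    Each internal primer carries a 5' tail copied from the neighbouring
    fragment so that adjacent PCR products share an overlap. The outermost
    forward and reverse primers get no overlap tail.
    """
    primers = []
    for i, frag in enumerate(fragments):
        upstream_tail = fragments[i - 1][-overlap:] if i > 0 else ""
        downstream_tail = fragments[i + 1][:overlap] if i < len(fragments) - 1 else ""
        forward = upstream_tail + frag[:anneal]
        # The reverse primer is written 5'->3' on the bottom strand, so the
        # neighbour-derived tail ends up at its 5' end after reverse complementing.
        reverse = reverse_complement(frag[-anneal:] + downstream_tail)
        primers.append((forward, reverse))
    return primers


# Toy sequences standing in for the Arabidopsis promoter, rice coding region
# and Arabidopsis terminator (real fragments are about 2 kb, a cDNA, and 500 bp).
promoter = "TTGACATATAATAGGAGGTCCGATCGATTACGCTAGCA"
coding = "ATGGCTTCCAAGGTCGAGCTGAAGCGTTAA"
terminator = "GATCGTTCAAACATTTGGCAATAAAGTTTC"

primers = overlap_primers([promoter, coding, terminator])
for name, (fwd, rev) in zip(("promoter", "coding", "terminator"), primers):
    print(name, fwd, rev)
```

Because the coding-region forward primer begins with the last 16 bases of the promoter, the promoter and coding-region products can prime on each other in the second-stage overlap PCR.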

The fused three-fragment piece flanked by Gateway cloning sites is introduced into the LTI donor vector pDONR201 (Invitrogen) using the BP clonase reaction, for confirmation by sequencing. Confirmed sequenced constructs are introduced, using the LR clonase reaction, into a binary vector containing Gateway cloning sites such as, for example, pAS200.

The pAS200 vector was created by inserting the Gateway cloning cassette RfA into the Acc65I site of pNOV3510.

pNOV3510 was created by ligation of inverted pNOV2114 VS1 binary into pNOV3507, a vector containing a PTX5′ Arab Protox promoter driving the PPO gene with the Nos terminator.

pNOV2114 was created by insertion of virN54D (Pazour et al. 1992, J. Bacteriol. 174:4169-4174) from pAD1289 (Hansen et al. 1994, PNAS 91:7603-7607) into pHiNK085.

pHiNK085 was created by deleting the 35S:PMI cassette and M13 ori in pVictor HiNK.

pVictor HiNK was created by modifying the T-DNA of pVictor (described in WO 97/04112) to delete M13 derived sequences and to improve its cloning versatility by introducing the BIGLINK polylinker.

The sequence of the pVictor HiNK vector is disclosed in SEQ ID NO: 5 in WO 00/6837, which is incorporated herein by reference. The pVictor HiNK vector contains the following constituents that are of functional importance:

-   The origin of replication (ORI) functional in Agrobacterium is derived from the Pseudomonas aeruginosa plasmid pVS1 (Itoh et al., 1984. Plasmid 11: 206-220; Itoh and Haas, 1985. Gene 36: 27-36). The pVS1 ORI is only functional in Agrobacterium and can be mobilised by the helper plasmid pRK2013 from E. coli into A. tumefaciens by means of a triparental mating procedure (Ditta et al., 1980. Proc. Natl. Acad. Sci. USA 77: 7347-7351).
-   The ColE1 origin of replication functional in E. coli is derived from pUC19 (Yanisch-Perron et al., 1985. Gene 33: 103-119).
-   The bacterial resistance to spectinomycin and streptomycin encoded by a 0.93 kb fragment from transposon Tn7 (Fling et al., 1985. Nucl. Acids Res. 13: 7095) functions as selectable marker for maintenance of the vector in E. coli and Agrobacterium. The gene is fused to the tac promoter for efficient bacterial expression (Amann et al., 1983. Gene 25: 167-178).
-   The right and left T-DNA border fragments of 1.9 kb and 0.9 kb, which comprise the 24 bp border repeats, have been derived from the Ti-plasmid of the nopaline type Agrobacterium tumefaciens strains pTiT37 (Yadav et al., 1982. Proc. Natl. Acad. Sci. USA 79: 6322-6326).

The plasmid is introduced into Agrobacterium tumefaciens GV3101 pMP90 by electroporation. The positive bacterial transformants are selected on LB medium containing 50 μg/ml kanamycin and 25 μg/ml gentamycin. Plants are transformed by standard methodology (e.g., by dipping flowers into a solution containing the Agrobacterium) except that 0.02% Silwet L-77 (Lehle Seeds, Round Rock, Tex.) is added to the bacterial suspension and the vacuum step is omitted. Five hundred (500) mg of seeds are planted per 2 ft² flat of soil, and progeny seeds are selected for transformants using PPO selection.

Primary transformants are analyzed for complementation. Primary transformants are genotyped for the Arabidopsis mutation and presence of the transgene. When possible, >50 mutants harboring the transgene should be phenotyped to observe variation due to transgene copy number and expression.

Example 8 Vector Construction for Overexpression and Gene “Knockout” Experiments

8.1 Overexpression

Vectors used for expression of full-length “grain filling candidate genes” of interest in plants (overexpression) are designed to overexpress the protein of interest and are of two general types, biolistic and binary, depending on the plant transformation method to be used.

For biolistic transformation (biolistic vectors), the requirements are as follows:

-   1. a backbone with a bacterial selectable marker (typically, an antibiotic resistance gene) and origin of replication functional in Escherichia coli (E. coli; e.g. ColE1), and
-   2. a plant-specific portion consisting of
    -   a. a gene expression cassette consisting of a promoter (e.g. ZmUBlint MOD), the gene of interest (typically, a full-length cDNA) and a transcriptional terminator (e.g. Agrobacterium tumefaciens nos terminator);
    -   b. a plant selectable marker cassette, consisting of a promoter (e.g. rice Act1D-BV MOD), selectable marker gene (e.g. phosphomannose isomerase, PMI) and transcriptional terminator (e.g. CaMV terminator).

Vectors designed for transformation by Agrobacterium tumefaciens (A. tumefaciens; binary vectors) consist of:

-   1. a backbone with a bacterial selectable marker functional in both E. coli and A. tumefaciens (e.g. spectinomycin resistance mediated by the aadA gene) and two origins of replication, functional in each of the aforementioned bacterial hosts, plus the A. tumefaciens virG gene;
-   2. a plant-specific portion as described for biolistic vectors above, except in this instance this portion is flanked by A. tumefaciens right and left border sequences which mediate transfer of the DNA flanked by these two sequences to the plant.

8.2 Knockout Vectors

Vectors designed for reducing or abolishing expression of a single gene or of a family of related genes (knockout vectors) are also of two general types corresponding to the methodology used to downregulate gene expression: antisense or double-stranded RNA interference (dsRNAi).

(a) Anti-Sense

For antisense vectors, a full-length or partial gene fragment (typically, a portion of the cDNA) can be used in the same vectors described for full-length expression, as part of the gene expression cassette. For antisense-mediated down-regulation of gene expression, the coding region of the gene or gene fragment will be in the opposite orientation relative to the promoter; thus, mRNA will be made from the non-coding (antisense) strand in planta.

(b) dsRNAi

For dsRNAi vectors, a partial gene fragment (typically, 300 to 500 basepairs long) is used in the gene expression cassette, and is expressed in both the sense and antisense orientations, separated by a spacer region (typically, a plant intron, e.g. the OsSH1 intron 1, or a selectable marker, e.g. conferring kanamycin resistance). Vectors of this type are designed to form a double-stranded mRNA stem, resulting from the basepairing of the two complementary gene fragments in planta.
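The orientation logic of the two knockout designs can be illustrated in a few lines. This is a sketch under stated assumptions: the fragment and spacer are short placeholder strings (the text suggests a 300 to 500 bp fragment and, for example, an intron as spacer), and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_insert(fragment: str) -> str:
    """Fragment placed in the opposite orientation relative to the promoter."""
    return reverse_complement(fragment)


def hairpin_insert(fragment: str, spacer: str) -> str:
    """Sense arm, spacer, then antisense arm, read 5'->3' from the promoter."""
    return fragment + spacer + reverse_complement(fragment)


fragment = "ATGGCTTCCAAGGTC"   # toy fragment, far shorter than a real 300-500 bp arm
spacer = "GTAAGTTTCTGCAG"      # placeholder only, not the OsSH1 intron sequence
insert = hairpin_insert(fragment, spacer)

print(antisense_insert(fragment))
print(insert)
```

When the hairpin insert is transcribed, the final arm is the reverse complement of the first, so the two arms can base-pair into the stem while the spacer forms the loop.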

Biolistic or binary vectors designed for overexpression or knockout can vary in a number of different ways, including e.g. the selectable markers used in plant and bacteria, the transcriptional terminators used in the gene expression and plant selectable marker cassettes, and the methodologies used for cloning in gene or gene fragments of interest (typically, conventional restriction enzyme-mediated or Gateway™ recombinase-based cloning). An important variant is the nature of the gene expression cassette promoter, which can drive expression of the gene or gene fragment of interest in most tissues of the plants (constitutive, e.g. ZmUBlint MOD), in specific plant tissues (e.g. maize ADP-gpp for endosperm-specific expression), or in an inducible fashion (e.g. GAL4bsBz1 for estradiol-inducible expression in lines constitutively expressing the cognate transcriptional activator for this promoter).

Example 9 Insertion of a “Grain Filling Candidate Gene” 1 into Expression Vector

A validated rice cDNA clone in pCR2.1-TOPO or the pGEM-T easy vector is subcloned using conventional restriction enzyme-based cloning into a vector, downstream of the maize ubiquitin promoter and intron, and upstream of the Agrobacterium tumefaciens nos 3′ end transcriptional terminator. The resultant gene expression cassette (promoter, “grain filling candidate gene” and terminator) is further subcloned, using conventional restriction enzyme-based cloning, into the pNOV2117 binary vector (Negrotto et al. (2000) Plant Cell Reports 19, 798-803; plasmid pNOV117 disclosed in this article corresponds to pNOV2117 described herein; the nucleotide sequence of pNOV2117 is provided in SEQ ID NO: 44 of WO 01/73087), generating pNOVCAND.

The pNOVCAND binary vector is designed for transformation and over-expression of the “grain filling candidate gene” in monocots. It consists of a binary backbone containing the sequences necessary for selection and growth in Escherichia coli DH-5α (Invitrogen) and Agrobacterium tumefaciens LBA4404 (pAL4404; pSB1), including the bacterial spectinomycin antibiotic resistance aadA gene from E. coli transposon Tn7, origins of replication for E. coli (ColE1) and A. tumefaciens (VS1), and the A. tumefaciens virG gene. In addition to the binary backbone, which is identical to that of pNOV2114 described herein previously (see Example 7 above), pNOV2117 contains the T-DNA portion flanked by the right and left border sequences, and including the Positech™ (Syngenta) plant selectable marker (WO 94/20627) and the “grain filling candidate gene” gene expression cassette. The Positech™ plant selectable marker confers resistance to mannose and in this instance consists of the maize ubiquitin promoter driving expression of the PMI (phosphomannose isomerase) gene, followed by the cauliflower mosaic virus transcriptional terminator.

Plasmid pNOV2117 is introduced into Agrobacterium tumefaciens LBA4404 (pAL4404; pSB1) by electroporation. Plasmid pAL4404 is a disarmed helper plasmid (Ooms et al. (1982) Plasmid 7, 15-29). Plasmid pSB1 is a plasmid with a wide host range that contains a region of homology to pNOV2117 and a 15.2 kb KpnI fragment from the virulence region of pTiBo542 (Ishida et al. (1996) Nat Biotechnol 14, 745-750). Introduction of plasmid pNOV2117 into Agrobacterium strain LBA4404 results in a co-integration of pNOV2117 and pSB1.

Alternatively, plasmid pCIB7613, which contains the hygromycin phosphotransferase (hpt) gene (Gritz and Davies, Gene 25, 179-188, 1983) as a selectable marker, may be employed for transformation.

Plasmid pCIB7613 (see WO 98/06860, incorporated herein by reference in its entirety) is selected for rice transformation. In pCIB7613, the transcription of the nucleic acid sequence coding for hygromycin phosphotransferase (HYG gene) is driven by the corn ubiquitin promoter (ZmUbi) and enhanced by corn ubiquitin intron 1. The 3′ polyadenylation signal is provided by the NOS 3′ nontranslated region.

Other useful plasmids include pNADII002 (GAL4-ER-VP16), which contains the yeast GAL4 DNA binding domain (Keegan et al., Science, 231:699 (1986)), the mammalian estrogen receptor ligand binding domain (Greene et al., Science, 231:1150 (1986)) and the transcriptional activation domain of the HSV VP16 protein (Triezenberg et al., 1988), and pSGCDL1 (GAL4BS Bz1 Luciferase), which carries the firefly luciferase reporter gene under control of a minimal maize Bronze1 (Bz1) promoter with 10 upstream synthetic GAL4 binding sites. Both hpt and GAL4-ER-VP16 are constitutively expressed using the maize ubiquitin promoter. All constructs use termination signals from the nopaline synthase gene.

Example 10 Plant Transformation

10.1 Rice Transformation

pNOVCAND is transformed into a rice cultivar (Kaybonnet) using Agrobacterium-mediated transformation, and mannose-resistant calli are selected and regenerated.

Agrobacterium is grown on YPC solid plates for 2-3 days prior to experiment initiation. Agrobacterial colonies are suspended in liquid MS media to an OD of 0.2 at 600 nm. Acetosyringone is added to the agrobacterial suspension to a concentration of 200 μM and the Agrobacterium is induced for 30 min.
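The two adjustments in this step (bringing the suspension to the target OD and adding acetosyringone to the working concentration) follow the usual C1·V1 = C2·V2 dilution relation. The sketch below assumes a 200 mM acetosyringone stock, as listed in the materials of Example 12; the 10 ml suspension volume, the measured OD of 1.0, and the function names are hypothetical.

```python
def stock_volume_ul(final_conc_um: float, final_volume_ml: float,
                    stock_conc_mm: float) -> float:
    """Volume of stock (in microliters) needed to reach a final concentration.

    Uses C1 * V1 = C2 * V2 with the stock converted from mM to uM.
    """
    stock_conc_um = stock_conc_mm * 1000.0
    final_volume_ul = final_volume_ml * 1000.0
    return final_conc_um * final_volume_ul / stock_conc_um


def dilution_factor(measured_od: float, target_od: float) -> float:
    """Fold dilution needed to bring a suspension down to the target OD."""
    if measured_od < target_od:
        raise ValueError("suspension is already below the target OD")
    return measured_od / target_od


# Hypothetical working values: 10 ml of suspension, 200 mM stock.
acetosyringone_ul = stock_volume_ul(final_conc_um=200, final_volume_ml=10,
                                    stock_conc_mm=200)
fold = dilution_factor(measured_od=1.0, target_od=0.2)

print(f"add {acetosyringone_ul:.1f} ul of stock to 10 ml of suspension")
print(f"dilute the suspension {fold:.1f}-fold to reach OD 0.2")
```

A 200 mM stock brought to 200 μM is a 1:1000 dilution, so the stock volume added is small enough that its effect on the final volume is ignored here.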

Three-week-old calli, which are induced from the scutellum of mature seeds in the N6 medium (Chu, C. C. et al., Sci. Sin., 18, 659-668 (1975)), are incubated in the Agrobacterium solution in a 100×25 petri plate for 30 minutes with occasional shaking. The solution is then removed with a pipette and the callus is transferred to MSAs medium which is overlaid with sterile filter paper.

Co-Cultivation is continued for 2 days in the dark at 22° C.

Calli are then placed on MS-Timentin plates for 1 week. After that they are transferred to PAA+ mannose selection media for 3 weeks.

Growing calli (putative events) are picked and transferred to PAA+ mannose media and cultivated for 2 weeks in light.

Colonies are transferred to MS20SorbKinTim regeneration media in plates for 2 weeks in light. Small plantlets are transferred to MS20SorbKinTim regeneration media in GA7 containers. When they reach the lid, they are transferred to soil in the greenhouse.

Expression of the “grain filling candidate gene” in transgenic T0 plants is analyzed. Additional rice cultivars, such as but not limited to Nipponbare, Taipei 309 and Fuzisaka 2, are also transformed and assayed for expression of the “grain filling candidate gene” product and enhanced protein expression.

10.2 Maize Transformation

Transformation of immature maize embryos is performed essentially as described in Negrotto et al., (2000) Plant Cell Reports 19: 798-803. For this example, all media constituents are as described in Negrotto et al., supra. However, various media constituents described in the literature may be substituted.

1. Transformation Plasmids and Selectable Marker

The genes used for transformation are cloned into a vector suitable for maize transformation as described in Example 17. Vectors used contain the phosphomannose isomerase (PMI) gene (Negrotto et al. (2000) Plant Cell Reports 19: 798-803).

2. Preparation of Agrobacterium tumefaciens

Agrobacterium strain LBA4404 (pSB1) containing the plant transformation plasmid is grown on YEP (yeast extract (5 g/L), peptone (10 g/L), NaCl (5 g/L), 15 g/L agar, pH 6.8) solid medium for 2 to 4 days at 28° C. Approximately 0.8×10⁹ Agrobacteria are suspended in LS-inf media supplemented with 100 μM acetosyringone (As) (Negrotto et al., (2000) Plant Cell Rep 19: 798-803). Bacteria are pre-induced in this medium for 30-60 minutes.

3. Inoculation

Immature embryos from A188 or other suitable maize genotypes are excised from 8-12 day old ears into liquid LS-inf+100 μM As. Embryos are rinsed once with fresh infection medium. Agrobacterium solution is then added and embryos are vortexed for 30 seconds and allowed to settle with the bacteria for 5 minutes. The embryos are then transferred scutellum side up to LSAs medium and cultured in the dark for two to three days. Subsequently, between 20 and 25 embryos per petri plate are transferred to LSDc medium supplemented with cefotaxime (250 mg/l) and silver nitrate (1.6 mg/l) and cultured in the dark at 28° C. for 10 days.

4. Selection of Transformed Cells and Regeneration of Transformed Plants

Immature embryos producing embryogenic callus are transferred to LSDIM0.5S medium. The cultures are selected on this medium for 6 weeks with a subculture step at 3 weeks. Surviving calli are transferred either to LSDIM0.5S medium to be bulked-up or to Reg1 medium. Following culturing in the light (16 hour light/8 hour dark regimen), green tissues are then transferred to Reg2 medium without growth regulators and incubated for 1-2 weeks. Plantlets are transferred to Magenta GA-7 boxes (Magenta Corp, Chicago Ill.) containing Reg3 medium and grown in the light. Plants that are PCR positive for the promoter-reporter cassette are transferred to soil and grown in the greenhouse.

Example 11 Promoter Analysis

The gene chip experiments described above in Examples 3 and 4 are designed to uncover genes that are expressed in seed tissue during grain filling. Candidate promoters are identified based upon the expression profiles of the associated transcripts, representatives of which are provided in SEQ ID NOs: 643-883.

Candidate promoters are obtained by PCR and fused to a GUS reporter gene containing an intron. Both histochemical and fluorometric GUS assays are carried out on stably transformed rice and maize plants and GUS activity is detected in the transformants.

Further, transient assays with the promoter:GUS constructs are carried out in rice embryogenic callus and GUS activity is detected by histochemical staining according to the protocol described below (see Example 12).

Construction of Binary Promoter::Reporter Plasmids

To construct a binary promoter::reporter plasmid for rice transformation, a vector containing a promoter of interest (i.e., the DNA sequence 5′ of the initiation codon for the gene of interest) is used. This entry vector results from recombination in a BP reaction between pDONR201™ and a PCR product amplified using the promoter of interest as a template. The regulatory/promoter sequence is fused to the GUS reporter gene (Jefferson et al., 1987) by recombination using GATEWAY™ Technology according to the manufacturer's protocol as described in the Instruction Manual (GATEWAY™ Cloning Technology, GIBCO BRL, Rockville, Md. http://www.lifetech.com/).

Briefly, the Gateway Gus-intron-Gus (GIG)/NOS expression cassette is ligated into the pNOV2117 binary vector in 5′ to 3′ orientation. The 4.1 kb expression cassette is ligated into the KpnI site of pNOV2117, and clones are then screened for orientation to obtain pNOV2346, a GATEWAY™ adapted binary destination vector.

The promoter fragment in the entry vector is recombined via the LR reaction with the binary destination vector containing the GUS coding region with an intron that has an attR site 5′ to the GUS reporter, producing a binary vector with a promoter fused to the GUS reporter (pNOVCANDProm). The orientation of the inserted fragment is maintained by the att sequences and the final construct is verified by sequencing. The construct is then transformed into Agrobacterium tumefaciens strains by electroporation as described herein previously (see Example 9).

Example 12 Transient Expression Analysis of Candidate Promoters in Rice Embryogenic Callus

Materials:

-   Embryogenic rice callus (Kaybonnet cultivar)
-   LBA 4404 Agrobacterium strains
-   KCMS liquid media for re-suspending bacterial pellet
-   200 mM stock (40 mg/ml) Acetosyringone
-   Sterile filter paper discs (8.5 mm in diameter)
-   LB spec liquid culture
-   MS-CIM media plates
-   MS-AS plates (co-cultivation plates)
-   MS-Tim plates (recovery plates)
-   Gus staining solution

Methods:

Induction of Embryogenic Callus:

-   1. Sterilize mature Kaybonnet rice seeds in 40% ultra Clorox, 1 drop Tween 20, for 40 min.
-   2. Rinse with sterile water and plate on MS-CIM media (12 seeds/plate).
-   3. Grow in dark for four weeks.
-   4. Isolate embryogenic calli from scutellum to MS-CIM.
-   5. Let grow in dark 8 days before use for transformation.

Agrobacterium Preparation and Induction:

-   1. Start 6 mL shaking cultures of LBA4404 Agrobacterium strains harboring rice promoter binary plasmids.
-   2. Grow the cultures at room temperature for 48 hrs in the rotary shaker.
-   3. Spin down the cultures at 8,000 rpm at 4° C. and re-suspend bacterial pellets in 10 ml of KCMS media supplemented with 100 μM Acetosyringone.
-   4. Place in the shaker at room temp for 1 hr for induction of Agrobacterium virulence genes.
-   5. In a sterile hood dilute Agrobacterium cultures 1:3 in KCMS media and transfer diluted cultures into deep petri dishes.

Inoculation of Plant Material and Staining:

-   6. In a sterile hood transfer embryogenic callus into diluted Agrobacterium solution and incubate for 30 minutes.
-   7. In a sterile hood blot callus tissue on sterile filter paper and transfer onto MS-AS plates.
-   8. Co-culture plates in 22° C. growth chamber in the dark for two days.
-   9. In a sterile hood transfer callus tissue to MS-Tim plates for tissue recovery (the presence of Timentin will prevent Agrobacterium growth).
-   10. Incubate tissue on MS-Tim media for two days at 22° C. in the dark.
-   11. Remove callus tissue from the plates and stain for 48 hrs in GUS staining solution.
-   12. De-stain tissue in 70% EtOH for 24 hours.

Recipes:

KCMS media (liquid), pH to 5.5

-   100 ml/l MS Major Salts, 10 ml/l MS Minor Salts, 5 ml/l MS iron stock, 0.5M K₂HPO₄, 0.1 mg/ml Myo-Inositol, 1.3 μg/ml Thiamine, 0.2 μg/ml 2,4-D (1 mg/ml), 0.1 μg/ml Kinetin, 3% Sucrose, 100 μM Acetosyringone

MS-CIM media, pH 5.8

-   MS Basal salt (4.3 g/L), B5 Vitamins (200×) (5 ml/L), 2% Sucrose (20 g/L), Proline (500 mg/L), Glutamine (500 mg/L), Casein Hydrolysate (300 mg/L), 2 μg/ml 2,4-D, Phytagel (3 g/L)

MS-As Medium, pH 5.8

-   MS Basal salt (4.3 g/L), B5 Vitamins (200×) (5 ml/L), 2% Sucrose (20 g/L), Proline (500 mg/L), Glutamine (500 mg/L), Casein Hydrolysate (300 mg/L), 2 μg/ml 2,4-D, Phytagel (3 g/L), 200 μM Acetosyringone

MS-Tim media, pH 5.8

-   MS Basal salt (4.3 g/L), B5 Vitamins (200×) (5 ml/L), 2% Sucrose (20 g/L), Proline (500 mg/L), Glutamine (500 mg/L), Casein Hydrolysate (300 mg/L), 2 μg/ml 2,4-D, Phytagel (3 g/L), 400 mg/l Timentin

Gus staining solution, pH 7

-   0.3 M Mannitol; 0.02 M EDTA, pH 7.0; 0.04 M NaH₂PO₄; 1 mM X-Gluc

The binary promoter::reporter plasmids described in Example 9 above can also be used for stable transformation of rice and maize plants according to the protocols provided in Examples 10.1 and 10.2, respectively.

Example 13 Analysis of Mutant and Transgenic Plant Material

Two tiers of assays can be used for analysis of the mutant and transgenic plant material.

-   Near InfraRed (NIR) spectrophotometric analysis of seeds. NIR enables evaluation of changes in starch, oil, protein and fiber content at very high throughput (1 sample/sec).
-   DIA or MRI imaging.

DIA or MRI imaging allows observation of gross morphology and surface area of major seed tissues and compartments (embryo, aleurone, endosperm, seed coat). Transgenic lines can also be physically sectioned and directly observed for changes in seed compartment morphology.

Lines showing alterations in grain composition will be advanced to a second tier of assays dependent upon the nature of the change detected:

-   1) Protein track:
    -   1-D and 2-D protein gels: protein profiles
    -   HPLC: amino acid profiles
    -   DTNB or papain staining: protein redox status
    -   GC: N/C/S ratios
-   2) Starch track:
    -   Iodine staining: content, branching
    -   Glucose-6-P analysis: phosphorylation level
-   3) Oils track:
    -   GC: oil, fatty acid profile

Example 14 Chromosomal Markers to Identify the Location of a Nucleic Acid Sequence

The sequences of the present invention can also be used for SSR mapping. SSR mapping in rice has been described by Miyao et al. (DNA Res 3:233 (1996)) and Yang et al. (Mol Gen Genet 245:187 (1994)), and in maize by Ahn et al. (Mol Gen Genet 241:483 (1993)). SSR mapping can be achieved using various methods. In one instance, polymorphisms are identified when sequence specific probes flanking an SSR contained within a sequence are made and used in polymerase chain reaction (PCR) assays with template DNA from two or more individuals or, in plants, near isogenic lines. A change in the number of tandem repeats between the SSR-flanking sequences produces differently sized fragments (U.S. Pat. No. 5,766,847). Alternatively, polymorphisms can be identified by using the PCR fragment produced from the SSR-flanking sequence specific primer reaction as a probe against Southern blots representing different individuals (Refseth et al., Electrophoresis 18:1519 (1997)). Rice SSRs can be used to map a molecular marker closely linked to a functional gene, as described by Akagi et al. (Genome 39:205 (1996)).

The sequences of the present invention can be used to identify and develop a variety of microsatellite markers, including the SSRs described above, as genetic markers for comparative analysis and mapping of genomes.

Many of the polynucleotides listed in Tables 2 to 11 contain at least 3 consecutive di-, tri- or tetranucleotide repeat units in their coding region that can potentially be developed into SSR markers. Trinucleotide motifs that can be commonly found in the coding regions of said polynucleotides and easily identified by screening the polynucleotide sequences for said motifs are, for example: CGG, GCC, CGC, GGC, etc. Once such a repeat unit has been found, primers can be designed which are complementary to the region flanking the repeat unit and used in any of the methods described below.
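Screening a sequence for such repeats can be sketched with a regular expression. The example below is an illustration under stated assumptions, not the screening method used for the listed polynucleotides: the function name, the decision to skip single-base runs, and the toy sequences are all hypothetical.

```python
import re


def find_ssrs(seq: str, min_repeats: int = 3, unit_lengths=(2, 3, 4)):
    """Return (start, unit, repeat_count) for tandem di-, tri- and tetranucleotide repeats."""
    seq = seq.upper()
    hits = []
    for k in unit_lengths:
        # A unit of k bases followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if len(set(unit)) == 1:
                continue  # ignore homopolymer runs such as AAAAAA
            hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequences: four CGG units and four GA units embedded in flanking bases.
print(find_ssrs("AATCGGCGGCGGCGGTTA"))
print(find_ssrs("TTGAGAGAGACC"))
```

The reported start position and repeat count give the boundaries of the repeat, so flanking primers can be chosen from the sequence on either side. Note that the regular expression reports the first repeat phase it meets when scanning left to right, so the same tract may be reported as CGG or GCG depending on the surrounding bases.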

Sequences of the present invention can also be used in a variation of the SSR technique known as inter-SSR (ISSR), which uses microsatellite oligonucleotides as primers to amplify genomic segments different from the repeat region itself (Zietkiewicz et al., Genomics 20:176 (1994)). ISSR employs oligonucleotides based on a simple sequence repeat anchored or not at their 5′- or 3′-end by two to four arbitrarily chosen nucleotides, which triggers site-specific annealing and initiates PCR amplification of genomic segments which are flanked by inversely orientated and closely spaced repeat sequences. In one embodiment of the present invention, microsatellite markers as disclosed herein, or substantially similar sequences or allelic variants thereof, may be used to detect the appearance or disappearance of markers indicating genomic instability as described by Leroy et al. (Electron. J Biotechnol, 3(2), at http://www.ejb.org (2000)), where alteration of a fingerprinting pattern indicated loss of a marker corresponding to a part of a gene involved in the regulation of cell proliferation. Microsatellite markers are useful for detecting genomic alterations such as the change observed by Leroy et al. (Electron. J Biotechnol, 3(2), supra (2000)), which appeared to be the consequence of microsatellite instability at the primer binding site or modification of the region between the microsatellites, and illustrated somaclonal variation leading to genomic instability. Consequently, sequences of the present invention are useful for detecting genomic alterations involved in somaclonal variation, which is an important source of new phenotypes.

In addition, because the genomes of closely related species are largely syntenic (that is, they display the same ordering of genes within the genome), these maps can be used to isolate novel alleles from wild relatives of crop species by positional cloning strategies. This shared synteny is very powerful for using genetic maps from one species to map genes in another. For example, a gene mapped in rice provides information for the gene location in maize and wheat.

Example 15 Quantitative Trait Linked Breeding

Various types of maps can be used with the sequences of the invention to identify Quantitative Trait Loci (QTLs) for a variety of uses, including marker-assisted breeding. Many important crop traits are quantitative traits and result from the combined interactions of several genes. These genes reside at different loci in the genome, often on different chromosomes, and generally exhibit multiple alleles at each locus. Developing markers, tools, and methods to identify and isolate the QTLs involved in a trait enables marker-assisted breeding to enhance desirable traits or suppress undesirable traits. The sequences disclosed herein can be used as markers for QTLs in marker-assisted breeding. The sequences of the invention can be used to identify QTLs and isolate alleles as described by Li et al. in a study of QTLs involved in resistance to a pathogen of rice (Li et al., Mol Gen Genet 261:58 (1999)). In addition to isolating QTL alleles in rice, other cereals, and other monocot and dicot crop species, the sequences of the invention can also be used to isolate alleles from the corresponding QTL(s) of wild relatives. Transgenic plants having various combinations of QTL alleles can then be created and the effects of the combinations measured. Once an ideal allele combination has been identified, crop improvement can be accomplished either through biotechnological means or by directed conventional breeding programs (Flowers et al., J Exp Bot 51:99 (2000); Tanksley and McCouch, Science 277:1063 (1997)).

Example 16 Marker-Assisted Breeding

Markers or genes associated with specific desirable or undesirable traits are known and used in marker-assisted breeding programs. It is particularly beneficial to be able to screen large numbers of markers and large numbers of candidate parental plants or progeny plants. The methods of the invention allow high-volume, multiplex screening for numerous markers from numerous individuals simultaneously.

A multiplex assay is designed providing SSRs specific to each of the markers of interest. The SSRs are linked to different classes of beads. All of the relevant markers may be expressed genes, so RNA or cDNA techniques are appropriate. RNA is extracted from root tissue of 1000 different individual plants and hybridized in parallel reactions with the different classes of beads. Each class of beads is analyzed for each sample using a microfluidics analyzer. For the classes of beads corresponding to qualitative traits, qualitative measures of presence or absence of the target gene are recorded. For the classes of beads corresponding to quantitative traits, quantitative measures of gene activity are recorded. Individuals showing activity of all of the qualitative genes and highest expression levels of the quantitative traits are selected for further breeding steps. In procedures wherein no individuals have desirable results for all the measured genes, individuals having the most desirable, and fewest undesirable, results are selected for further breeding steps. In either case, progeny are screened to further select for homozygotes with high quantitative levels of expression of the quantitative traits.
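The selection rule described above can be sketched in Python as follows. The `Readout` structure, the marker names, and the use of a simple sum of quantitative readings as the ranking score are assumptions made for illustration only; the text does not specify how bead readouts are stored or how quantitative measures are combined.

```python
from typing import Dict, List, NamedTuple


class Readout(NamedTuple):
    """Hypothetical per-plant assay result."""
    qualitative: Dict[str, bool]    # marker -> target gene detected?
    quantitative: Dict[str, float]  # marker -> measured expression level


def select_for_breeding(readouts: Dict[str, Readout], top_n: int) -> List[str]:
    """Pick plants for the next breeding step.

    Plants positive for every qualitative marker are preferred and ranked by
    summed quantitative expression (an assumed score). If no plant is positive
    for all qualitative markers, all plants are ranked by the number of
    positive qualitative markers first and summed expression second.
    """
    def score(item):
        _, r = item
        return (sum(r.qualitative.values()), sum(r.quantitative.values()))

    complete = [(name, r) for name, r in readouts.items()
                if all(r.qualitative.values())]
    pool = complete if complete else list(readouts.items())
    ranked = sorted(pool, key=score, reverse=True)
    return [name for name, _ in ranked[:top_n]]


# Hypothetical readouts for three plants and made-up marker names.
plants = {
    "plant_1": Readout({"q1": True, "q2": True}, {"e1": 2.0, "e2": 3.0}),
    "plant_2": Readout({"q1": True, "q2": True}, {"e1": 4.0, "e2": 5.0}),
    "plant_3": Readout({"q1": True, "q2": False}, {"e1": 10.0, "e2": 10.0}),
}
print(select_for_breeding(plants, top_n=1))
```

In this sketch plant_3 is passed over despite its high expression values because it lacks one qualitative marker, which mirrors the priority given to the qualitative genes in the text.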

Example 17 Method of Modifying the Gene Frequency

The invention further provides a method of modifying the frequency of a gene in a plant population, including the steps of: identifying an SSR within a coding region of a gene; screening a plurality of plants using the SSR as a marker to determine the presence or absence of the gene in an individual plant; selecting at least one individual plant for breeding based on the presence or absence of the gene; and breeding at least one plant thus selected to produce a population of plants having a modified frequency of the gene. The identification of the SSR within the coding region of a gene can be accomplished based on sequence similarity between the nucleic acid molecules of the invention and the region within the gene of interest flanking the SSR.
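The effect of the selection step on gene frequency can be illustrated with a small Python sketch. It assumes a diploid population scored at a single locus, with each plant recorded as carrying 0, 1 or 2 copies of the SSR-tagged allele, and it only compares the frequency in the screened population with the frequency among the selected parents. The genotype counts are hypothetical, and segregation in later generations is not modelled.

```python
from typing import Iterable, List


def allele_frequency(copies_per_plant: Iterable[int]) -> float:
    """Frequency of the marker allele in a diploid population.

    copies_per_plant: for each plant, the number of copies (0, 1 or 2) of the
    SSR-tagged allele as scored by the marker assay.
    """
    copies = list(copies_per_plant)
    if not copies:
        raise ValueError("population is empty")
    if any(c not in (0, 1, 2) for c in copies):
        raise ValueError("each plant must carry 0, 1 or 2 copies")
    return sum(copies) / (2 * len(copies))


def select_carriers(copies_per_plant: Iterable[int]) -> List[int]:
    """Keep only plants in which the marker allele is present."""
    return [c for c in copies_per_plant if c > 0]


# Hypothetical screened population: 10 homozygous carriers,
# 40 heterozygous carriers and 50 non-carriers.
population = [2] * 10 + [1] * 40 + [0] * 50
parents = select_carriers(population)
print(allele_frequency(population), allele_frequency(parents))
```

Selecting in the opposite direction (keeping only plants in which the marker is absent) would lower the frequency instead, corresponding to selection based on absence of the gene.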

Supporting TABLES

TABLE 12
This table illustrates the correlation between rice sequences in sub-groups I and III that show homologies between 80% and 99.9% to each other

Sub-Group II Sequences (SEQ ID NO) | Sub-Group I Sequences (SEQ ID NO)
513 | 121, 123
515 | 333
517 | 441; 443
519 | 151
521 | 9
523 | 73
525 | 203
527 | 215
529 | 209
531 | 103
533 | 407
535 | 115
537 | 165
539 | 1
541 | 325
543 | 397
545 | 61
547 | 455
549 | 255
551 | 351
553 | 225
555 | 139
557 | 25
559 | 3
561 | 17
563 | 279
565 | 191
567 | 451
569 | 417
571 | 99; 95; 435
573 | 91; 81
575 | 95; 99
577 | 85
579 | 229; 223
581 | 83
583 | 401; 235
585 | 283
587 | 179
589 | 135
591 | 141
595 | 5
597 | 311
599 | 379
601 | 123; 121
603 | 335
605 | 287
607 | 161
609 | 69
611 | 177
615 | 413
617 | 143
619 | 251
621 | 331
623 | 375
625 | 67
627 | 387
629 | 81; 91
631 | 89
633 | 181
635 | 297
637 | 309
639 | 329
641 | 229
593, 613 | 221

TABLE 13
This table illustrates the correlation between rice sequences in subgroups I and II

155 | 507
191 | 503
89 | 509
187 | 505
299 | 501
447 | 511

TABLE 14
Description of “Grain Filling” QTLs identified in Tables 2 and 3

QTL: OS-AE-1-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Allelopathic effect Citation: BREEDING SCIENCE (2001) 51: 47-51 Chromosome: 1 Flanking Markers(s):
QTL: OS-AE-11-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Allelopathic effect Citation: BREEDING SCIENCE (2001) 51: 47-51 Chromosome: 11 Flanking Markers(s):
QTL: OS-AE-12-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Allelopathic effect Citation: BREEDING SCIENCE (2001) 51: 47-51 Chromosome: 12 Flanking Markers(s):
QTL: OS-AE-5-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Allelopathic effect Citation: BREEDING SCIENCE (2001) 51: 47-51 Chromosome: 5 Flanking Markers(s):
QTL: OS-AMY-5-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Amylose content Citation: THEOR APPL GENET (1999) 98: 502-508 Chromosome: 5 Flanking Markers(s):
QTL: OS-AMY-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Amylose content Citation: THEOR APPL GENET (1999) 99: 642-648 Chromosome: 6 Flanking Markers(s):
QTL: OS-AMY-6-2 Species: Oryza sativa General Trait: QUALITY Specific Trait: Amylose content Citation: THEOR APPL GENET (1999) 98: 502-508 Chromosome: 6 Flanking Markers(s):
QTL: OS-APDF-9-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Albino plantlet differentiation frequency Citation: MOLECULAR BREEDING (1998) 4: 165-172 Chromosome: 9 Flanking Markers(s):
QTL: OS-ASS-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Alkali spreading score Citation: THEOR APPL GENET (1999) 98: 502-508 Chromosome: 6 Flanking Markers(s):
QTL: OS-ASS-6-2 Species: Oryza sativa General Trait: QUALITY Specific Trait: Alkali spreading score Citation: THEOR APPL GENET (1999) 98: 502-508 Chromosome: 6 Flanking Markers(s):
QTL: OS-BDV-1-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Breakdown viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 1 Flanking Markers(s):
QTL: OS-BDV-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Breakdown viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 6 Flanking Markers(s):
QTL: OS-CHALK-1-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Grain chalkiness Citation: THEOR APPL GENET (2000) 101: 823-829 Chromosome: 1 Flanking Markers(s): 0
QTL: OS-CHALK-10-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Grain chalkiness Citation: THEOR APPL GENET (2000) 101: 823-829 Chromosome: 10 Flanking Markers(s): 83.5
QTL: OS-CHALK-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Grain chalkiness Citation: THEOR APPL GENET (2000) 101: 823-829 Chromosome: 6 Flanking Markers(s): 12.5
QTL: OS-CIF-6-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Callus induction frequency Citation: MOLECULAR BREEDING (1998) 4: 165-172 Chromosome: 6 Flanking Markers(s):
QTL: OS-CPV-1-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Cool paste viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 1 Flanking Markers(s):
QTL: OS-CPV-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Cool paste viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 6 Flanking Markers(s):
QTL: OS-CPV-6-2 Species: Oryza sativa General Trait: QUALITY Specific Trait: Cool paste viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 6 Flanking Markers(s):
QTL: OS-CSV-1-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Consistency viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 1 Flanking Markers(s):
QTL: OS-CSV-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Consistency viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 6 Flanking Markers(s):
QTL: OS-CSV-6-2 Species: Oryza sativa General Trait: QUALITY Specific Trait: Consistency viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 6 Flanking Markers(s):
QTL: OS-DM-6-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Dry Mass Citation: PLANT PHYSIOLOGY (2001) 125: 406-422 Chromosome: 6 Flanking Markers(s): 16.7
QTL: OS-FLLEN-3-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Source-sink capacity Citation: MOLECULAR BREEDING (1998) 4: 419-426 Chromosome: 2 Flanking Markers(s): 160
QTL: OS-FLLEN-9-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Source-sink capacity Citation: MOLECULAR BREEDING (1998) 4: 419-426 Chromosome: 4 Flanking Markers(s):
QTL: OS-FLWID-3-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Source-sink capacity Citation: MOLECULAR BREEDING (1998) 4: 419-426 Chromosome: 8 Flanking Markers(s):
QTL: OS-GC-2-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Gel consistency Citation: THEOR APPL GENET (1999) 98: 502-508 Chromosome: 2 Flanking Markers(s):
QTL: OS-GC-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Gel consistency Citation: THEOR APPL GENET (1999) 99: 642-648 Chromosome: 6 Flanking Markers(s):
QTL: OS-GP-1-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grains per panicle Citation: THEOR APPL GENET (2000) 101: 248-254 Chromosome: 1 Flanking Markers(s):
QTL: OS-GP-6-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grains per panicle Citation: THEOR APPL GENET (2000) 101: 248-254 Chromosome: 6 Flanking Markers(s):
QTL: OS-GPDF-1-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Green plantlet differentiation frequency Citation: MOLECULAR BREEDING (1998) 4: 165-172 Chromosome: 1 Flanking Markers(s):
QTL: OS-GPL-1-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grains per plant Citation: GENETICS (1998) 150: 899-909 Chromosome: 1 Flanking Markers(s):
QTL: OS-GPL-2-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grains per plant Citation: GENETICS (1998) 150: 899-909 Chromosome: 2 Flanking Markers(s):
QTL: OS-GPL-4-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grains per plant Citation: GENETICS (1998) 150: 899-909 Chromosome: 4 Flanking Markers(s):
QTL: OS-GPL-8-2 Species: Oryza sativa General Trait: YIELD Specific Trait: Grains per plant Citation: GENETICS (1998) 150: 899-909 Chromosome: 8 Flanking Markers(s):
QTL: OS-GPP-4-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grains per panicle Citation: GENETICS (1998) 150: 899-909 Chromosome: 4 Flanking Markers(s):
QTL: OS-GPP-8-2 Species: Oryza sativa General Trait: YIELD Specific Trait: Grains per panicle Citation: GENETICS (1998) 150: 899-909 Chromosome: 8 Flanking Markers(s):
QTL: OS-GPYF-1-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Green plantlet yield frequency Citation: MOLECULAR BREEDING (1998) 4: 165-172 Chromosome: 1 Flanking Markers(s):
QTL: OS-GW-1-2 Species: Oryza sativa General Trait: YIELD Specific Trait: 1000 grain weight Citation: THEOR APPL GENET (2001) 102: 41-52 Chromosome: 1 Flanking Markers(s):
QTL: OS-GW-3-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain weight - 1000 grains Citation: GENETICS (1998) 150: 899-909 Chromosome: 3 Flanking Markers(s):
QTL: OS-GW-3-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain weight Citation: THEOR APPL GENET (2000) 101: 248-254 Chromosome: 3 Flanking Markers(s):
QTL: OS-GW-3-1 Species: Oryza sativa General Trait: YIELD Specific Trait: 1000 grain weight Citation: THEOR APPL GENET (2001) 102: 41-52 Chromosome: 3 Flanking Markers(s):
QTL: OS-GW-5-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain weight - 1000 grains Citation: GENETICS (1998) 150: 899-909 Chromosome: 5 Flanking Markers(s):
QTL: OS-GW-5-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain weight Citation: THEOR APPL GENET (2000) 101: 248-254 Chromosome: 5 Flanking Markers(s):
QTL: OS-GW-9-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain weight - 1000 grains Citation: GENETICS (1998) 150: 899-909 Chromosome: 9 Flanking Markers(s):
QTL: OS-GW100-4-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain weight - 100 grains Citation: THEOR APPL GENET (1998) 96: 957-963 Chromosome: 4 Flanking Markers(s): 100
QTL: OS-GYLD-1-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain yield - tons/ha Citation: GENETICS (1998) 150: 899-909 Chromosome: 1 Flanking Markers(s):
QTL: OS-GYLD-2-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain yield - tons/ha Citation: GENETICS (1998) 150: 899-909 Chromosome: 2 Flanking Markers(s):
QTL: OS-GYLD-4-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain yield - tons/ha Citation: GENETICS (1998) 150: 899-909 Chromosome: 4 Flanking Markers(s):
QTL: OS-GYLD-8-2 Species: Oryza sativa General Trait: YIELD Specific Trait: Grain yield - tons/ha Citation: GENETICS (1998) 150: 899-909 Chromosome: 8 Flanking Markers(s):
QTL: OS-HPV-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Hot paste viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 6 Flanking Markers(s):
QTL: OS-HPV-6-2 Species: Oryza sativa General Trait: QUALITY Specific Trait: Hot paste viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 6 Flanking Markers(s):
QTL: OS-PGWC-8-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Percentage of grain with white core Citation: THEOR APPL GENET (1999) 98: 502-508 Chromosome: 8 Flanking Markers(s):
QTL: OS-REGEN-3-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Regeneration ability Citation: THEOR APPL GENET (1999) 98: 243-251 Chromosome: 3 Flanking Markers(s): 9
QTL: OS-RGT-2-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Reproductive growth time Citation: THEOR APPL GENET (2001) 102: 1236-1242 Chromosome: 2 Flanking Markers(s):
QTL: OS-RGT-5-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Reproductive growth time Citation: THEOR APPL GENET (2001) 102: 1236-1242 Chromosome: 5 Flanking Markers(s):
QTL: OS-SBV-1-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Setback viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 1 Flanking Markers(s):
QTL: OS-SBV-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Setback viscosity Citation: THEOR APPL GENET (2000) 100: 280-284 Chromosome: 6 Flanking Markers(s):
QTL: OS-VGT-2-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Vegetative growth time Citation: THEOR APPL GENET (2001) 102: 1236-1242 Chromosome: 2 Flanking Markers(s):
QTL: OS-VGT-2-2 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Vegetative growth time Citation: THEOR APPL GENET (2001) 102: 1236-1242 Chromosome: 2 Flanking Markers(s):
QTL: OS-VGT-5-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Vegetative growth time Citation: THEOR APPL GENET (2001) 102: 1236-1242 Chromosome: 5 Flanking Markers(s):
QTL: OS-VGT-9-1 Species: Oryza sativa General Trait: DEVELOPMENT Specific Trait: Vegetative growth time Citation: THEOR APPL GENET (2001) 102: 1236-1242 Chromosome: 9 Flanking Markers(s):
QTL: OS-WC-6-1 Species: Oryza sativa General Trait: QUALITY Specific Trait: Grain white core Citation: THEOR APPL GENET (2000) 101: 823-829 Chromosome: 6 Flanking Markers(s): 13.5
QTL: OS-Y-6-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Yield Citation: THEOR APPL GENET (2000) 101: 248-254 Chromosome: 6 Flanking Markers(s):
QTL: OS-YLD-1-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Yield Citation: THEOR APPL GENET (2001) 102: 41-52 Chromosome: 1 Flanking Markers(s):
QTL: OS-YLD-5-1 Species: Oryza sativa General Trait: YIELD Specific Trait: Yield Citation: THEOR APPL GENET (2001) 102: 793-800 Chromosome: 5 Flanking Markers(s):
QTL: ZM-BIOM-3-1 Species: Zea mays General Trait: YIELD Specific Trait: “Biomass, above ground” Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 3 Flanking Markers(s): “UMC3, UMC96”
QTL: ZM-BIOM-5-1 Species: Zea mays General Trait: YIELD Specific Trait: “Biomass, above ground” Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 5 Flanking Markers(s): UMC166
QTL: ZM-BIOM-7-1 Species: Zea mays General Trait: YIELD Specific Trait: “Biomass, above ground” Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 7 Flanking Markers(s): “UMC116, BNL14.07”
QTL: ZM-BIOM-8-1 Species: Zea mays General Trait: YIELD Specific Trait: “Biomass, above ground” Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 8 Flanking Markers(s): “UMC138L, UMC12”
QTL: ZM-CL-9-1 Species: Zea mays General Trait: QUALITY Specific Trait: Cellulose content Citation: THEOR APPL GENET (2001) 102: 591-599 Chromosome: 9 Flanking Markers(s):
QTL: ZM-CPC-1-2 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC76
QTL: ZM-CPC-1-2 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein content Citation: CROP SCI (2001) 41: 690-697 Chromosome: 1 Flanking Markers(s): 224
QTL: ZM-CPC-1-3 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC58
QTL: ZM-CPC-1-4 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC128
QTL: ZM-CPC-1-5 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC67
QTL: ZM-CPC-1-6 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC83
QTL: ZM-CPC-10-1 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 10 Flanking Markers(s): UMC130
QTL: ZM-CPC-3-1 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 3 Flanking Markers(s): UMC154
QTL: ZM-CPC-3-2 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 3 Flanking Markers(s): BNL1.297
QTL: ZM-CPC-3-3 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 3 Flanking Markers(s): UMC10
QTL: ZM-CPC-5-1 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 5 Flanking Markers(s): BNL6.22
QTL: ZM-CPC-6-2 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 6 Flanking Markers(s): UMC85
QTL: ZM-CPC-7-2 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 7 Flanking Markers(s): UMC98B
QTL: ZM-CPC-7-3 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 7 Flanking Markers(s): UMC56
QTL: ZM-CPC-8-1 Species: Zea mays General Trait: QUALITY Specific Trait: Crude protein concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 8 Flanking Markers(s): UMC71
QTL: ZM-DMC-1-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC33
QTL: ZM-DMC-1-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome: 1 Flanking Markers(s):
QTL: ZM-DMC-1-2 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC128
QTL: ZM-DMC-10-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 10 Flanking Marker(s): UMC146
QTL: ZM-DMC-10-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome: 10 Flanking Markers(s):
QTL: ZM-DMC-10-2 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 10 Flanking Markers(s): UMC146
QTL: ZM-DMC-2-3 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome: 2 Flanking Markers(s):
QTL: ZM-DMC-5-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 5 Flanking Markers(s): UMC68
QTL: ZM-DMC-5-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter content Citation: CROP SCI (2001) 41: 690-697 Chromosome: 5 Flanking Markers(s): 116
QTL: ZM-DMC-5-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome: 5 Flanking Markers(s):
QTL: ZM-DMC-6-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 6 Flanking Markers(s): UMC85
QTL: ZM-DMC-6-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome: 6 Flanking Markers(s):
QTL: ZM-DMC-6-2 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 6 Flanking Marker(s): UMC59
QTL: ZM-DMC-8-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 8 Flanking Markers(s): UMC117
QTL: ZM-DMC-8-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter content Citation: CROP SCI (2001) 41: 690-697 Chromosome: 8 Flanking Markers(s): 132
QTL: ZM-DMC-8-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome: 8 Flanking Markers(s):
QTL: ZM-DMC-8-2 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter concentration Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 8 Flanking Markers(s): UMC71
QTL: ZM-DMC-8-2 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter content Citation: CROP SCI (2001) 41: 690-697 Chromosome: 8 Flanking Markers(s): 176
QTL: ZM-DMY-1-2 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC167
QTL: ZM-DMY-1-3 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC83A
QTL: ZM-DMY-1-4 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): BNL5.59
QTL: ZM-DMY-1-5 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC83
QTL: ZM-DMY-10-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 10 Flanking Markers(s): UMC64
QTL: ZM-DMY-10-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (2001) 41: 690-697 Chromosome: 10 Flanking Markers(s): 56
QTL: ZM-DMY-2-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 2 Flanking Markers(s): UMC53
QTL: ZM-DMY-2-3 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 2 Flanking Markers(s): UMC4
QTL: ZM-DMY-2-4 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 2 Flanking Markers(s): UMC36
QTL: ZM-DMY-3-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 3 Flanking Markers(s): BNL6.16
QTL: ZM-DMY-3-2 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 3 Flanking Markers(s): UMC154
QTL: ZM-DMY-3-3 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 3 Flanking Markers(s): UMC10
QTL: ZM-DMY-4-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 4 Flanking Markers(s): UMC31
QTL: ZM-DMY-4-2 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 4 Flanking Markers(s): BNL7.65
QTL: ZM-DMY-4-3 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 4 Flanking Markers(s): UMC42
QTL: ZM-DMY-4-4 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 4 Flanking Markers(s): UMC127B
QTL: ZM-DMY-5-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 5 Flanking Markers(s): BNL7.71
QTL: ZM-DMY-8-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 8 Flanking Markers(s): UMC120
QTL: ZM-DMY-8-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (2001) 41: 690-697 Chromosome: 8 Flanking Markers(s): 172
QTL: ZM-DMY-8-2 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 8 Flanking Markers(s): UMC12A
QTL: ZM-DMY-9-1 Species: Zea mays General Trait: YIELD Specific Trait: Dry matter yield Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 9 Flanking Markers(s): UMC95
QTL: ZM-EWT-2-1 Species: Zea mays General Trait: YIELD Specific Trait: Ear weight Citation: THEOR APPL GENET (1999) 99: 280-288 Chromosome: 2 Flanking Markers(s): PHI083
QTL: ZM-EWT-4-2 Species: Zea mays General Trait: YIELD Specific Trait: Ear weight Citation: THEOR APPL GENET (1999) 99: 280-288 Chromosome: 4 Flanking Markers(s): PHI093
QTL: ZM-GWE-9-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain weight per ear Citation: THEOR APPL GENET (2001) 102: 591-599 Chromosome: 9 Flanking Markers(s):
QTL: ZM-GWM2-1-1 Species: Zea mays General Trait: YIELD Specific Trait: “Yield, grain weight per square meter” Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 1 Flanking Markers(s): “UMC163, UMC161”
QTL: ZM-GWM2-10-1 Species: Zea mays General Trait: YIELD Specific Trait: “Yield, grain weight per square meter” Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 10 Flanking Markers(s): “UMC146, UMC44”
QTL: ZM-GWM2-3-1 Species: Zea mays General Trait: YIELD Specific Trait: “Yield, grain weight per square meter” Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 3 Flanking Markers(s): “UMC92, UMC10”
QTL: ZM-GWM2-3-2 Species: Zea mays General Trait: YIELD Specific Trait: “Yield, grain weight per square meter” Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 3 Flanking Markers(s): “UMC3, UMC96”
QTL: ZM-GWM2-7-1 Species: Zea mays General Trait: YIELD Specific Trait: “Yield, grain weight per square meter” Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 7 Flanking Markers(s): “BNL15.40, UMC116”
QTL: ZM-GYHA-1-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield per hectare Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 1 Flanking Markers(s):
QTL: ZM-GYHA-1-2 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield per hectare Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 1 Flanking Markers(s):
QTL: ZM-GYHA-1-3 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield per hectare Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 1 Flanking Markers(s):
QTL: ZM-GYHA-1-4 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield per hectare Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 1 Flanking Markers(s):
QTL: ZM-GYHA-3-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield per hectare Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 3 Flanking Markers(s):
QTL: ZM-GYHA-5-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield per hectare Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 5 Flanking Markers(s):
QTL: ZM-GYHA-6-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield per hectare Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 6 Flanking Markers(s):
QTL: ZM-GYHA-8-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield per hectare Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 8 Flanking Markers(s):
QTL: ZM-GYLD-1-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 1 Flanking Markers(s):
QTL: ZM-GYLD-1-2 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 1 Flanking Markers(s):
QTL: ZM-GYLD-2-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: PLANT BREEDING (1998) 117: 193-202 Chromosome: 2 Flanking Markers(s): “CDOCMT202, CSU75C”
QTL: ZM-GYLD-2-2 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 2 Flanking Markers(s):
QTL: ZM-GYLD-2-3 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 2 Flanking Markers(s):
QTL: ZM-GYLD-2-4 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 2 Flanking Markers(s):
QTL: ZM-GYLD-3-3 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 3 Flanking Markers(s):
QTL: ZM-GYLD-4-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 4 Flanking Markers(s):
QTL: ZM-GYLD-5-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 5 Flanking Markers(s):
QTL: ZM-GYLD-5-2 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 5 Flanking Markers(s):
QTL: ZM-GYLD-5-3 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 5 Flanking Markers(s):
QTL: ZM-GYLD-6-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: PLANT BREEDING (1998) 117: 193-202 Chromosome: 6 Flanking Markers(s): “CSU70, CDO580B”
QTL: ZM-GYLD-6-2 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 6 Flanking Markers(s):
QTL: ZM-GYLD-6-3 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 6 Flanking Markers(s):
QTL: ZM-GYLD-6-4 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 6 Flanking Markers(s):
QTL: ZM-GYLD-7-3 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 7 Flanking Markers(s):
QTL: ZM-GYLD-8-2 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 8 Flanking Markers(s):
QTL: ZM-GYLD-9-1 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 9 Flanking Markers(s):
QTL: ZM-GYLD-9-2 Species: Zea mays General Trait: YIELD Specific Trait: Grain yield Citation: CROP SCI (2000) 40: 30-39 Chromosome: 9 Flanking Markers(s):
QTL: ZM-GYUI-9-1 Species: Zea mays General Trait: YIELD Specific Trait: Yield under corn borer infestation Citation: THEOR APPL GENET (2000) 101: 907-917 Chromosome: 9 Flanking Markers(s):
QTL: ZM-GYUI-9-2 Species: Zea mays General Trait: YIELD Specific Trait: Yield under corn borer infestation Citation: THEOR APPL GENET (2000) 101: 907-917 Chromosome: 9 Flanking Markers(s):
QTL: ZM-GYUP-1-1 Species: Zea mays General Trait: YIELD Specific Trait: Yield under corn borer protection Citation: THEOR APPL GENET (2000) 101: 907-917 Chromosome: 1 Flanking Markers(s):
QTL: ZM-GYUP-1-2 Species: Zea mays General Trait: YIELD Specific Trait: Yield under corn borer protection Citation: THEOR APPL GENET (2000) 101: 907-917 Chromosome: 1 Flanking Markers(s):
QTL: ZM-GYUP-9-1 Species: Zea mays General Trait: YIELD Specific Trait: Yield under corn borer protection Citation: THEOR APPL GENET (2000) 101: 907-917 Chromosome: 9 Flanking Markers(s):
QTL: ZM-GYUP-9-2 Species: Zea mays General Trait: YIELD Specific Trait: Yield under corn borer protection Citation: THEOR APPL GENET (2000) 101: 907-917 Chromosome: 9 Flanking Markers(s):
QTL: ZM-HI-1-1 Species: Zea mays General Trait: YIELD Specific Trait: Harvest index Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 1 Flanking Markers(s): “UMC94, UMC76”
QTL: ZM-HI-1-2 Species: Zea mays General Trait: YIELD Specific Trait: Harvest index Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 1 Flanking Markers(s): “UMC163, UMC161”
QTL: ZM-HI-10-1 Species: Zea mays General Trait: YIELD Specific Trait: Harvest index Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 10 Flanking Markers(s): “UMC146, UMC44”
QTL: ZM-HI-3-1 Species: Zea mays General Trait: YIELD Specific Trait: Harvest index Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 3 Flanking Markers(s): “UMC92, UMC10”
QTL: ZM-HI-4-1 Species: Zea mays General Trait: YIELD Specific Trait: Harvest index Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 4 Flanking Markers(s): “UMC28.1, UMC19”
QTL: ZM-HI-7-1 Species: Zea mays General Trait: YIELD Specific Trait: Harvest index Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 7 
Flanking Markers(s): “BNL15.40,UMC116” QTL: ZM-HI-8-1 Species: Zea mays General Trait: YIELD SpecificTrait: Harvest index Citation: THEOR APPL GENET (1999) 99: 1106-1119Chromosome: 8 Flanking Markers(s): “UMC138L, UMC12” QTL: ZM-ID-10-1Species: Zea mays General Trait: QUALITY Specific Trait: In vitrodigestibility of organic stover Citation: THEOR APPL GENET (2000) 101:907-917 Chromosome: 10 Flanking Markers(s): QTL: ZM-ID-2-1 Species: Zeamays General Trait: QUALITY Specific Trait: In vitro digestibility oforganic stover Citation: THEOR APPL GENET (2000) 101: 907-917Chromosome: 2 Flanking Markers(s): QTL: ZM-ID-5-1 Species: Zea maysGeneral Trait: QUALITY Specific Trait: In vitro digestibility of organicstover Citation: THEOR APPL GENET (2000) 101: 907-917 Chromosome: 5Flanking Markers(s): QTL: ZM-ID-5-2 Species: Zea mays General Trait:QUALITY Specific Trait: In vitro digestibility of organic stoverCitation: THEOR APPL GENET (2000) 101: 907-917 Chromosome: 5 FlankingMarkers(s): QTL: ZM-ID-8-1 Species: Zea mays General Trait: QUALITYSpecific Trait: In vitro digestibility of organic stover Citation: THEORAPPL GENET (2000) 101: 907-917 Chromosome: 8 Flanking Markers(s): QTL:ZM-IVDOM-1-1 Species: Zea mays General Trait: QUALITY Specific Trait: Invitro digestible organic matter Citation: CROP SCI (1998) 38: 1278-1289Chromosome: 1 Flanking Markers(s): UMC76 QTL: ZM-IVDOM-1-2 Species: Zeamays General Trait: QUALITY Specific Trait: In vitro digestible organicmatter Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 1 FlankingMarkers(s): UMC58 QTL: ZM-IVDOM-1-3 Species: Zea mays General Trait:QUALITY Specific Trait: In vitro digestible organic matter Citation:CROP SCI (1998) 38: 1278-1289 Chromosome: 1 Flanking Markers(s): UMC167QTL: ZM-IVDOM-1-4 Species: Zea mays General Trait: QUALITY SpecificTrait: In vitro digestible organic matter Citation: CROP SCI (1998) 38:1278-1289 Chromosome: 1 Flanking Markers(s): UMC37 QTL: ZM-IVDOM-10-1Species: Zea mays General Trait: QUALITY 
Specific Trait: In vitrodigestible organic matter Citation: CROP SCI (1998) 38: 1278-1289Chromosome: 10 Flanking Markers(s): UMC130 QTL: ZM-IVDOM-10-2 Species:Zea mays General Trait: QUALITY Specific Trait: In vitro digestibleorganic matter Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 10Flanking Markers(s): UMC18 QTL: ZM-IVDOM-3-1 Species: Zea mays GeneralTrait: QUALITY Specific Trait: In vitro digestible organic matterCitation: CROP SCI (1998) 38: 1278-1289 Chromosome: 3 FlankingMarkers(s): UMC97 QTL: ZM-IVDOM-3-3 Species: Zea mays General Trait:QUALITY Specific Trait: In vitro digestible organic matter Citation:CROP SCI (1998) 38: 1278-1289 Chromosome: 3 Flanking Markers(s): UMC97QTL: ZM-IVDOM-5-1 Species: Zea mays General Trait: QUALITY SpecificTrait: In vitro digestible organic matter Citation: CROP SCI (1998) 38:1278-1289 Chromosome: 5 Flanking Markers(s): UMC43 QTL: ZM-IVDOM-5-2Species: Zea mays General Trait: QUALITY Specific Trait: In vitrodigestible organic matter Citation: CROP SCI (1998) 38: 1278-1289Chromosome: 5 Flanking Markers(s): BNL7.71 QTL: ZM-IVDOM-5-3 Species:Zea mays General Trait: QUALITY Specific Trait In vitro digestibleorganic matter Citation: CROP SCI (1998) 38: 1278-1289 Chromosome: 5Flanking Markers(s): UMC90 QTL: ZM-IVDOM-9-1 Species: Zea mays GeneralTrait: QUALITY Specific Trait: In vitro digestible organic matterCitation: CROP SCI (1998) 38: 1278-1289 Chromosome: 9 FlankingMarkers(s): BNL5.09 QTL: ZM-IVDOM-9-2 Species: Zea mays General Trait:QUALITY Specific Trait: In vitro digestible organic matter Citation:CROP SCI (1998) 38: 1278-1289 Chromosome: 9 Flanking Markers(s):BNL14.28 QTL: ZM-KNE-4-1 Species: Zea mays General Trait: YIELD SpecificTrait: Kernel number per ear Citation: THEOR APPL GENET (1999) 99:280-288 Chromosome: 4 Flanking Markers(s): PHI093 QTL: ZM-KW100-1-2Species: Zea mays General Trait: YIELD Specific Trait: Kernel weight per100 kernels Citation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome:1 Flanking 
Markers(s): “UMC157, BNL8.29” QTL: ZM-KW100-3-1 Species: Zeamays General Trait: YIELD Specific Trait: Kernel weight per 100 kernelsCitation: THEOR APPL GENET (1999) 99: 1106-1119 Chromosome: 3 FlankingMarkers(s): UMC60 QTL: ZM-KW100-9-1 Species: Zea mays General Trait:YIELD Specific Trait: Kernel weight per 100 kernels Citation: THEOR APPLGENET (1999) 99: 1106-1119 Chromosome: 9 Flanking Markers(s): “UMC153,BNL5.09” QTL: ZM-KW300-1-2 Species: Zea mays General Trait: YIELDSpecific Trait: Kernel weight per 300 kernels Citation: CROP SCI (1998)38: 1296-1308 Chromosome: 1 Flanking Markers(s): QTL: ZM-KW300-3-2Species: Zea mays General Trait: YIELD Specific Trait: Kernel weight per300 kernels Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 3Flanking Markers(s): QTL: ZM-KW300-3-3 Species: Zea mays General Trait:YIELD Specific Trait: Kernel weight per 300 kernels Citation: CROP SCI(1998) 38: 1296-1308 Chromosome: 3 Flanking Markers(s): QTL:ZM-KW300-4-2 Species: Zea mays General Trait: YIELD Specific Trait:Kernel weight per 300 kernels Citation: CROP SCI (1998) 38: 1296-1308Chromosome: 4 Flanking Markers(s): QTL: ZM-KW300-5-1 Species: Zea maysGeneral Trait: YIELD Specific Trait: Kernel weight per 300 kernelsCitation: CROP SCI (1998) 38: 1296-1308 Chromosome: 5 FlankingMarkers(s): QTL: ZM-KW300-6-2 Species: Zea mays General Trait: YIELDSpecific Trait: Kernel weight per 300 kernels Citation: CROP SCI (1998)38: 1296-1308 Chromosome: 6 Flanking Markers(s): QTL: ZM-KW300-8-2Species: Zea mays General Trait: YIELD Specific Trait: Kernel weight per300 kernels Citation: CROP SCI (1998) 38: 1296-1308 Chromosome: 8Flanking Markers(s): QTL: ZM-KW300-9-1 Species: Zea mays General Trait:YIELD Specific Trait: Kernel weight per 300 kernels Citation: CROP SCI(1998) 38: 1296-1308 Chromosome: 9 Flanking Markers(s): QTL:ZM-KW300-9-2 Species: Zea mays General Trait: YIELD Specific Trait:Kernel weight per 300 kernels Citation: CROP SCI (1998) 38: 1296-1308Chromosome: 9 Flanking Markers(s): 
QTL: ZM-KWE-4-1 Species: Zea maysGeneral Trait: YIELD Specific Trait: Kernel weight per ear Citation:THEOR APPL GENET (1999) 99: 280-288 Chromosome: 4 Flanking Markers(s):PHI093 QTL: ZM-MOIST-1-1 Species: Zea mays General Trait: QUALITYSpecific Trait: Grain moisture Citation: CROP SCI (2000) 40: 30-39Chromosome: 1 Flanking Markers(s): QTL: ZM-MOIST-1-2 Species: Zea maysGeneral Trait: QUALITY Specific Trait: Grain moisture Citation: CROP SCI(2000) 40: 30-39 Chromosome: 1 Flanking Markers(s): QTL: ZM-MOIST-1-3Species: Zea mays General Trait: QUALITY Specific Trait: Grain moistureCitation: CROP SCI (2000) 40: 30-39 Chromosome: 1 Flanking Markers(s):QTL: ZM-MOIST-1-4 Species: Zea mays General Trait: QUALITY SpecificTrait: Grain moisture Citation: CROP SCI (2000) 40: 30-39 Chromosome: 1Flanking Markers(s): QTL: ZM-MOIST-1-5 Species: Zea mays General Trait:QUALITY Specific Trait: Grain moisture Citation: CROP SCI (2000) 40:30-39 Chromosome: 1 Flanking Markers(s): QTL: ZM-MOIST-1-6 Species: Zeamays Genera) Trait: QUALITY Specific Trait: Grain moisture Citation:CROP SCI (2000) 40: 30-39 Chromosome: 1 Flanking Markers(s): QTL:ZM-MOIST-10-1 Species: Zea mays General Trait: QUALITY Specific Trait:Grain moisture Citation: CROP SCI (2000) 40: 30-39 Chromosome: 10Flanking Markers(s): QTL: ZM-MOIST-2-1 Species: Zea mays General Trait:QUALITY Specific Trait: Grain moisture Citation: CROP SCI (2000) 40:30-39 Chromosome: 2 Flanking Markers(s): QTL: ZM-MOIST-2-2 Species: Zeamays General Trait: QUALITY Specific Trait: Grain moisture Citation:CROP SCI (2000) 40: 30-39 Chromosome: 2 Flanking Markers(s): QTL:ZM-MOIST-2-3 Species: Zea mays General Trait: QUALITY Specific Trait:Grain moisture Citation: CROP SCI (2000) 40: 30-39 Chromosome: 2Flanking Markers(s): QTL: ZM-MOIST-3-2 Species: Zea mays General Trait:QUALITY Specific Trait: Grain moisture Citation: CROP SCI (2000) 40:30-39 Chromosome: 3 Flanking Markers(s): QTL: ZM-MOIST-3-3 Species: Zeamays General Trait: QUALITY Specific Trait: 
Grain moisture Citation:CROP SCI (2000) 40: 30-39 Chromosome: 3 Flanking Markers(s): QTL:ZM-MOIST-4-2 Species: Zea mays General Trait: QUALITY Specific Trait:Grain moisture Citation: CROP SCI (2000) 40: 30-39 Chromosome: 4Flanking Markers(s): QTL: ZM-MOIST-4-3 Species: Zea mays General Trait:QUALITY Specific Trait: Grain moisture Citation: CROP SCI (2000) 40:30-39 Chromosome: 4 Flanking Markers(s): QTL: ZM-MOIST-4-4 Species: Zeamays General Trait: QUALITY Specific Trait: Grain moisture Citation:CROP SCI (2000) 40: 30-39 Chromosome: 4 Flanking Markers(s): QTL:ZM-MOIST-5-1 Species: Zea mays General Trait: QUALITY Specific Trait:Grain moisture Citation: CROP SCI (2000) 40: 30-39 Chromosome: 5Flanking Markers(s): QTL: ZM-MOIST-5-2 Species: Zea mays General Trait:QUALITY Specific Trait: Grain moisture Citation: CROP SCI (2000) 40:30-39 Chromosome: 5 Flanking Markers(s): QTL: ZM-MOIST-5-3 Species: Zeamays General Trait: QUALITY Specific Trait: Grain moisture Citation:CROP SCI (2000) 40: 30-39 Chromosome: 5 Flanking Markers(s): QTL:ZM-MOIST-5-4 Species: Zea mays General Trait: QUALITY Specific Trait:Grain moisture Citation: CROP SCI (2000) 40: 30-39 Chromosome: 5Flanking Markers(s): QTL: ZM-MOIST-6-2 Species: Zea mays General Trait:QUALITY Specific Trait: Grain moisture Citation: CROP SCI (2000) 40:30-39 Chromosome: 6 Flanking Markers(s): QTL: ZM-MOIST-7-1 Species: Zeamays General Trait: QUALITY Specific Trait: Grain moisture Citation:CROP SCI (2000) 40: 30-39 Chromosome: 7 Flanking Markers(s): QTL:ZM-MOIST-7-2 Species: Zea mays General Trait: QUALITY Specific Trait:Grain moisture Citation: CROP SCI (2000) 40: 30-39 Chromosome: 7Flanking Markers(s): QTL: ZM-MOIST-7-3 Species: Zea mays General Trait:QUALITY Specific Trait: Grain moisture Citation: CROP SCI (2000) 40:30-39 Chromosome: 7 Flanking Markers(s): QTL: ZM-MOIST-7-4 Species: Zeamays General Trait: QUALITY Specific Trait: Grain moisture Citation:CROP SCI (2000) 40: 30-39 Chromosome: 7 Flanking Markers(s): 
QTL:ZM-MOIST-8-1 Species: Zea mays General Trait: QUALITY Specific Trait:Grain moisture Citation: CROP SCI (2000) 40: 30-39 Chromosome: 8Flanking Markers(s): QTL: ZM-MOIST-8-2 Species: Zea mays General Trait:QUALITY Specific Trait: Grain moisture Citation: CROP SCI (2000) 40:30-39 Chromosome: 8 Flanking Markers(s): QTL: ZM-MOIST-9-2 Species: Zeamays General Trait: QUALITY Specific Trait: Grain moisture Citation:CROP SCI (2000) 40: 30-39 Chromosome: 9 Flanking Markers(s): QTL:ZM-MOIST-9-3 Species: Zea mays General Trait: QUALITY Specific Trait:Grain moisture Citation: CROP SCI (2000) 40: 30-39 Chromosome: 9Flanking Markers(s): QTL: ZM-PC-1-1 Species: Zea mays General Trait:QUALITY Specific Trait: Protein concentration Citation: CROP SCI (1998)38: 1062-1072 Chromosome: 1 Flanking Markers(s): “CSU92, CSUCMT11B” QTL:ZM-PC-1-2 Species: Zea mays General Trait: QUALITY Specific Trait:Protein concentration Citation: CROP SCI (1998) 38: 1062-1072Chromosome: 1 Flanking Markers(s): “BNL8.29A, BNL6.32” QTL: ZM-PC-5-1Species: Zea mays General Trait: QUALITY Specific Trait: Proteinconcentration Citation: CROP SCI (1998) 38: 1062-1072 Chromosome: 5Flanking Markers(s): “UMC51A, UMC127B” QTL: ZM-PC-8-1 Species: Zea maysGeneral Trait: QUALITY Specific Trait: Protein concentration Citation:CROP SCI (1998) 38: 1062-1072 Chromosome: 8 Flanking Markers(s):“CSU75D, CDO580A” QTL: ZM-PC-9-1 Species: Zea mays General Trait:QUALITY Specific Trait: Protein concentration Citation: CROP SCI (1998)38: 1062-1072 Chromosome: 9 Flanking Markers(s): “CSU158, CSU147” QTL:ZM-PR-9-1 Species: Zea mays General Trait: QUALITY Specific Trait:Protein content Citation: THEOR APPL GENET (2001) 102: 591-599Chromosome: 9 Flanking Markers(s): QTL: ZM-STC-10-1 Species: Zea maysGeneral Trait: QUALITY Specific Trait: Starch concentration Citation:CROP SCI (1998) 38: 1278-1289 Chromosome: 10 Flanking Markers(s): UMC146QTL: ZM-STC-10-2 Species: Zea mays General Trait: QUALITY SpecificTrait: Starch concentration 
Citation: CROP SCI (1998) 38: 1278-1289Chromosome: 10 Flanking Markers(s): UMC18 QTL: ZM-STC-2-2 Species: Zeamays General Trait: QUALITY Specific Trait: Starch concentrationCitation: CROP SCI (1998) 38: 1278-1289 Chromosome: 2 FlankingMarkers(s): UMC36 QTL: ZM-STC-5-1 Species: Zea mays General Trait:QUALITY Specific Trait: Starch concentration Citation: CROP SCI (1998)38: 1278-1289 Chromosome: 5 Flanking Markers(s): BNL5.40 QTL: ZM-STC-5-1Species: Zea mays General Trait: QUALITY Specific Trait: Starch contentCitation: CROP SCI (2001) 41: 690-697 Chromosome: 5 Flanking Markers(s):60 QTL: ZM-STC-6-1 Species: Zea mays General Trait: QUALITY SpecificTrait: Starch concentration Citation: CROP SCI (1998) 38: 1278-1289Chromosome: 6 Flanking Markers(s): UMC46 QTL: ZM-STC-7-2 Species: Zeamays General Trait: QUALITY Specific Trait: Starch concentrationCitation: CROP SCI (1998) 38: 1278-1289 Chromosome: 7 FlankingMarkers(s): UMC110 QTL: ZM-STC-8-1 Species: Zea mays General Trait:QUALITY Specific Trait: Starch concentration Citation: CROP SCI (1998)38: 1278-1289 Chromosome: 8 Flanking Markers(s): UMC124 QTL: ZM-STC-8-1Species: Zea mays General Trait: QUALITY Specific Trait: Starch contentCitation: CROP SCI (2001) 41: 690-697 Chromosome: 8 Flanking Markers(s):54 QTL: ZM-TGW-4-1 Species: Zea mays General Trait: YIELD SpecificTrait: Thousand grain weight Citation: THEOR APPL GENET (2001) 102:591-599 Chromosome: 4 Flanking Markers(s): QTL: ZM-TGW-9-1 Species: Zeamays General Trait: YIELD Specific Trait: Thousand grain weightCitation: THEOR APPL GENET (2001) 102: 591-599 Chromosome: 9 FlankingMarkers(s): QTL: ZM-TGW-9-2 Species: Zea mays General Trait: YIELDSpecific Trait: Thousand grain weight Citation: THEOR APPL GENET (2001)102: 591-599 Chromosome: 9 Flanking Markers(s): QTL: ZM-TW-1-1 Species:Zea mays General Trait: YIELD Specific Trait: Test weight Citation:THEOR APPL GENET (2001) 102: 230-243 Chromosome: 1 Flanking Markers(s):QTL: ZM-TW-10-2 Species: Zea mays General Trait: 
YIELD Specific Trait:Test weight Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome:10 Flanking Markers(s): QTL: ZM-TW-2-3 Species: Zea mays General Trait:YIELD Specific Trait: Test weight Citation: THEOR APPL GENET (2001) 102:230-243 Chromosome: 2 Flanking Markers(s): QTL: ZM-TW-5-1 Species: Zeamays General Trait: YIELD Specific Trait Test weight Citation: THEORAPPL GENET (2001) 102: 230-243 Chromosome: 5 Flanking Markers(s): QTL:ZM-TW-8-1 Species: Zea mays General Trait: YIELD Specific Trait: Testweight Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome: 8Flanking Markers(s): QTL: ZM-TW-9-1 Species: Zea mays General Trait:YIELD Specific Trait: Test weight Citation: THEOR APPL GENET (2001) 102:230-243 Chromosome: 9 Flanking Markers(s): QTL: ZM-VT-6-1 Species: Zeamays General Trait: QUALITY Specific Trait: Vitreousness Citation: THEORAPPL GENET (2001) 102: 591-599 Chromosome: 6 Flanking Markers(s): QTL:ZM-YLD-1-1 Species: Zea mays General Trait: YIELD Specific Trait: Grainyield Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome: 1Flanking Markers(s): QTL: ZM-YLD-2-1 Species: Zea mays General Trait:YIELD Specific Trait: Grain yield Citation: THEOR APPL GENET (2001) 102:230-243 Chromosome: 2 Flanking Markers(s): QTL: ZM-YLD-2-2 Species: Zeamays General Trait: YIELD Specific Trait: Grain yield Citation: THEORAPPL GENET (2001) 102: 230-243 Chromosome: 2 Flanking Markers(s): QTL:ZM-YLD-4-1 Species: Zea mays General Trait: YIELD Specific Trait: Grainyield Citation: THEOR APPL GENET (2001) 102: 230-243 Chromosome: 4Flanking Markers(s): QTL: ZM-YLD-6-1 Species: Zea mays General Trait:YIELD Specific Trait: Grain yield Citation: THEOR APPL GENET (2001) 102:230-243 Chromosome: 6 Flanking Markers(s): QTL: ZM-YLD-9-1 Species: Zeamays General Trait: YIELD Specific Trait: Grain yield Citation: THEORAPPL GENET (2001) 102: 230-243 Chromosome: 9 Flanking Markers(s):

TABLE 15 Swiss-Prot Data
101 Accession: P10538 Swissprot_id: AMYB_SOYBN Gi_number: 231541 Description: BETA-AMYLASE (1,4-ALPHA-D-GLUCAN MALTOHYDROLASE)
113 Accession: Q9F234 Swissprot_id: AGL2_BACTQ Gi_number: 14423647 Description: Alpha-glucosidase II
1 Accession: P39210 Swissprot_id: MPV1_HUMAN Gi_number: 730059 Description: MPV17 protein
317 Accession: Q08759 Swissprot_id: MYB_XENLA Gi_number: 730090 Description: Myb protein
329 Accession: P25822 Swissprot_id: PUM_DROME Gi_number: 131605 Description: MATERNAL PUMILIO PROTEIN
173 Accession: P42862 Swissprot_id: G6PA_ORYSA Gi_number: 1169797 Description: Glucose-6-phosphate isomerase, cytosolic A (GPI-A) (Phosphoglucose isomerase A) (PGI-A) (Phosphohexose isomerase A) (PHI-A)
333 Accession: P02582 Swissprot_id: ACT1_MAIZE Gi_number: 113220 Description: ACTIN 1
233 Accession: P28968 Swissprot_id: VGLX_HSVEB Gi_number: 138350 Description: GLYCOPROTEIN X PRECURSOR
335 Accession: Q05201 Swissprot_id: EYA_DROME Gi_number: 544271 Description: DEVELOPMENTAL PROTEIN EYES ABSENT (PROTEIN CLIFT)
119 Accession: O24301 Swissprot_id: SUS2_PEA Gi_number: 3915037 Description: Sucrose synthase 2 (Sucrose-UDP glucosyltransferase 2)
311 Accession: P10290 Swissprot_id: MYBC_MAIZE Gi_number: 127585 Description: Anthocyanin regulatory C1 protein
149 Accession: P17784 Swissprot_id: ALF_ORYSA Gi_number: 113622 Description: FRUCTOSE-BISPHOSPHATE ALDOLASE, CYTOPLASMIC ISOZYME
155 Accession: Q40677 Swissprot_id: ALFC_ORYSA Gi_number: 3913018 Description: FRUCTOSE-BISPHOSPHATE ALDOLASE, CHLOROPLAST PRECURSOR (ALDP)
143 Accession: P46225 Swissprot_id: TPIC_SECCE Gi_number: 1174745 Description: Triosephosphate isomerase, chloroplast precursor (TIM)
307 Accession: P42777 Swissprot_id: GBF4_ARATH Gi_number: 1169863 Description: G-box binding factor 4
341 Accession: P16356 Swissprot_id: RPB1_CAEEL Gi_number: 133322 Description: DNA-DIRECTED RNA POLYMERASE II LARGEST SUBUNIT
193 Accession: P12624 Swissprot_id: MACS_BOVIN Gi_number: 585447 Description: MYRISTOYLATED ALANINE-RICH C-KINASE SUBSTRATE (MARCKS) (ACAMP-81)
131 Accession: Q43846 Swissprot_id: UGS4_SOLTU Gi_number: 2833389 Description: Soluble glycogen [starch] synthase, chloroplast precursor (SS III)
199 Accession: P08640 Swissprot_id: AMYH_YEAST Gi_number: 728850 Description: GLUCOAMYLASE S1/S2 PRECURSOR (GLUCAN 1,4-ALPHA-GLUCOSIDASE) (1,4-ALPHA-D-GLUCAN GLUCOHYDROLASE)
343 Accession: P28284 Swissprot_id: ICP0_HSV2H Gi_number: 124135 Description: Transacting transcriptional protein ICP0 (VMW118 protein)
287 Accession: O59800 Swissprot_id: CWF5_SCHPO Gi_number: 18202094 Description: Cell cycle control protein cwf5
191 Accession: Q9ZT66 Swissprot_id: E134_MAIZE Gi_number: 8928122 Description: Endo-1,3;1,4-beta-D-glucanase precursor
215 Accession: P07730 Swissprot_id: GLU2_ORYSA Gi_number: 121475 Description: GLUTELIN TYPE II PRECURSOR
23 Accession: O43791 Swissprot_id: SPOP_HUMAN Gi_number: 8134708 Description: Speckle-type POZ protein
147 Accession: P48494 Swissprot_id: TPIS_ORYSA Gi_number: 1351270 Description: Triosephosphate isomerase, cytosolic (TIM)
347 Accession: P37829 Swissprot_id: SCRK_SOLTU Gi_number: 585973 Description: FRUCTOKINASE
157 Accession: P32662 Swissprot_id: GPH_ECOLI Gi_number: 418445 Description: Phosphoglycolate phosphatase (PGP)
349 Accession: Q02910 Swissprot_id: CPN_DROME Gi_number: 416833 Description: CALPHOTIN
139 Accession: P12299 Swissprot_id: GLG2_WHEAT Gi_number: 1707930 Description: Glucose-1-phosphate adenylyltransferase large subunit, chloroplast precursor (ADP-glucose synthase) (ADP-glucose pyrophosphorylase) (AGPASE S) (Alpha-D-glucose-1-phosphate adenyl transferase)
175 Accession: P52178 Swissprot_id: CPT2_BRAOL Gi_number: 1706110 Description: Triose phosphate/phosphate translocator, non-green plastid, chloroplast precursor (CTPT)
5 Accession: P00434 Swissprot_id: PERX_BRARA Gi_number: 464365 Description: Peroxidase P7
351 Accession: P38682 Swissprot_id: GLO3_YEAST Gi_number: 729595 Description: ZINC FINGER PROTEIN GLO3
353 Accession: P37829 Swissprot_id: SCRK_SOLTU Gi_number: 585973 Description: FRUCTOKINASE
255 Accession: Q02817 Swissprot_id: MUC2_HUMAN Gi_number: 2506877 Description: MUCIN 2 PRECURSOR (INTESTINAL MUCIN 2)
75 Accession: P07206 Swissprot_id: PULA_KLEPN Gi_number: 131589 Description: Pullulanase precursor (Alpha-dextrin endo-1,6-alpha-glucosidase) (Pullulan 6-glucanohydrolase)
357 Accession: P33479 Swissprot_id: IE18_PRVKA Gi_number: 462387 Description: IMMEDIATE-EARLY PROTEIN IE 180
359 Accession: P08547 Swissprot_id: LIN1_HUMAN Gi_number: 126295 Description: LINE-1 REVERSE TRANSCRIPTASE HOMOLOG
361 Accession: P03211 Swissprot_id: EBN1_EBV Gi_number: 119110 Description: EBNA-1 NUCLEAR PROTEIN
363 Accession: Q02817 Swissprot_id: MUC2_HUMAN Gi_number: 2506877 Description: MUCIN 2 PRECURSOR (INTESTINAL MUCIN 2)
365 Accession: P08548 Swissprot_id: LIN1_NYCCO Gi_number: 126296 Description: LINE-1 REVERSE TRANSCRIPTASE HOMOLOG
181 Accession: Q41140 Swissprot_id: PFPA_RICCO Gi_number: 2499488 Description: PYROPHOSPHATE--FRUCTOSE 6-PHOSPHATE 1-PHOSPHOTRANSFERASE ALPHA SUBUNIT (PFP) (6-PHOSPHOFRUCTOKINASE (PYROPHOSPHATE)) (PYROPHOSPHATE-DEPENDENT 6-PHOSPHOFRUCTOSE-1-KINASE) (PPI-PFK)
367 Accession: P43125 Swissprot_id: RDGB_DROME Gi_number: 1172875 Description: RETINAL DEGENERATION B PROTEIN (PROBABLE CALCIUM TRANSPORTER RDGB)
261 Accession: Q59320 Swissprot_id: KDSB_CHLTR Gi_number: 7387818 Description: 3-DEOXY-MANNO-OCTULOSONATE CYTIDYLYLTRANSFERASE (CMP-KDO SYNTHETASE) (CMP-2-KETO-3-DEOXYOCTULOSONIC ACID SYNTHETASE) (CKS)
221 Accession: P55217 Swissprot_id: METB_ARATH Gi_number: 2507422 Description: CYSTATHIONINE GAMMA-SYNTHASE, CHLOROPLAST PRECURSOR (CGS) (O-SUCCINYLHOMOSERINE (THIOL)-LYASE)
57 Accession: P09830 Swissprot_id: ARAE_ECOLI Gi_number: 114102 Description: ARABINOSE-PROTON SYMPORTER (ARABINOSE TRANSPORTER)
25 Accession: Q9SYQ8 Swissprot_id: CLV1_ARATH Gi_number: 12643323 Description: RECEPTOR PROTEIN KINASE CLAVATA1 PRECURSOR
369 Accession: P06921 Swissprot_id: VE2_HPV05 Gi_number: 1352839 Description: REGULATORY PROTEIN E2
39 Accession: Q9UQ13 Swissprot_id: SHO2_HUMAN Gi_number: 14423936 Description: LEUCINE-RICH REPEAT PROTEIN SHOC-2 (RAS-BINDING PROTEIN SUR-8)
87 Accession: P27935 Swissprot_id: AM2A_ORYSA Gi_number: 113678 Description: Alpha-amylase isozyme 2A precursor (1,4-alpha-D-glucan glucanohydrolase)
371 Accession: Q02921 Swissprot_id: NO93_SOYBN Gi_number: 730165 Description: EARLY NODULIN 93 (N-93)
163 Accession: P52178 Swissprot_id: CPT2_BRAOL Gi_number: 1706110 Description: Triose phosphate/phosphate translocator, non-green plastid, chloroplast precursor (CTPT)
375 Accession: P54069 Swissprot_id: BE46_SCHPO Gi_number: 12644312 Description: BEM46 PROTEIN
315 Accession: P20025 Swissprot_id: MYB3_MAIZE Gi_number: 127582 Description: Myb-related protein Zm38
89 Accession: P27934 Swissprot_id: AM3E_ORYSA Gi_number: 113683 Description: ALPHA-AMYLASE ISOZYME 3E PRECURSOR (1,4-ALPHA-D-GLUCAN GLUCANOHYDROLASE)
289 Accession: P37833 Swissprot_id: AATC_ORYSA Gi_number: 584706 Description: ASPARTATE AMINOTRANSFERASE, CYTOPLASMIC (TRANSAMINASE A)
49 Accession: Q41144 Swissprot_id: STC_RICCO Gi_number: 3915039 Description: SUGAR CARRIER PROTEIN C
153 Accession: P21727 Swissprot_id: CPTR_PEA Gi_number: 117290 Description: TRIOSE PHOSPHATE/PHOSPHATE TRANSLOCATOR, CHLOROPLAST PRECURSOR (CTPT) (P36) (E30)
81 Accession: P17654 Swissprot_id: AMY1_ORYSA Gi_number: 113766 Description: ALPHA-AMYLASE PRECURSOR (1,4-ALPHA-D-GLUCAN GLUCANOHYDROLASE) (ISOZYME 1B)
379 Accession: O43516 Swissprot_id: WAIP_HUMAN Gi_number: 13124642 Description: WISKOTT-ALDRICH SYNDROME PROTEIN INTERACTING PROTEIN (WASP INTERACTING PROTEIN) (PRPL-2 PROTEIN)
305 Accession: Q02516 Swissprot_id: HAP5_YEAST Gi_number: 2493550 Description: TRANSCRIPTIONAL ACTIVATOR HAP5
381 Accession: P10978 Swissprot_id: POLX_TOBAC Gi_number: 130582 Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Contains: Protease; Reverse transcriptase; Endonuclease]
197 Accession: P01087 Swissprot_id: IAAT_ELECO Gi_number: 2851515 Description: Alpha-amylase/trypsin inhibitor (RBI) (RATI)
45 Accession: O76082 Swissprot_id: OCN2_HUMAN Gi_number: 8928257 Description: Organic cation/carnitine transporter 2 (Solute carrier family 22, member 5) (High-affinity sodium-dependent carnitine cotransporter)
97 Accession: Q01885 Swissprot_id: RAG2_ORYSA Gi_number: 548671 Description: SEED ALLERGENIC PROTEIN RAG2 PRECURSOR
383 Accession: Q9WTV7 Swissprot_id: RNFB_MOUSE Gi_number: 13124535 Description: RING FINGER PROTEIN 12 (LIM DOMAIN INTERACTING RING FINGER PROTEIN) (RING FINGER LIM DOMAIN-BINDING PROTEIN) (R-LIM)
135 Accession: P55241 Swissprot_id: GLG1_MAIZE Gi_number: 1707924 Description: Glucose-1-phosphate adenylyltransferase large subunit 1, chloroplast precursor (ADP-glucose synthase) (ADP-glucose pyrophosphorylase) (AGPASE S) (Alpha-D-glucose-1-phosphate adenyl transferase) (Shrunken-2)
267 Accession: P05143 Swissprot_id: PRP3_MOUSE Gi_number: 131002 Description: PROLINE-RICH PROTEIN MP-3
385 Accession: Q10993 Swissprot_id: CYTB_HELAN Gi_number: 1706277 Description: CYSTEINE PROTEINASE INHIBITOR B (CYSTATIN B) (SCB)
283 Accession: P49311 Swissprot_id: GRP2_SINAL Gi_number: 1346181 Description: Glycine-rich RNA-binding protein GRP2A
53 Accession: P39163 Swissprot_id: CHAC_ECOLI Gi_number: 12644253 Description: CATION TRANSPORT PROTEIN CHAC
253 Accession: Q9KQX0 Swissprot_id: LPXK_VIBCH Gi_number: 14423750 Description: Tetraacyldisaccharide 4′-kinase (Lipid A 4′-kinase)
295 Accession: Q9I2W7 Swissprot_id: MENG_PSEAE Gi_number: 17369015 Description: S-adenosylmethionine:2-demethylmenaquinone methyltransferase
389 Accession: P13983 Swissprot_id: EXTN_TOBAC Gi_number: 119714 Description: Extensin precursor (Cell wall hydroxyproline-rich glycoprotein)
225 Accession: P14323 Swissprot_id: GLU4_ORYSA Gi_number: 121476 Description: GLUTELIN PRECURSOR
391 Accession: P08453 Swissprot_id: GDB2_WHEAT Gi_number: 121101 Description: GAMMA-GLIADIN PRECURSOR
167 Accession: P32604 Swissprot_id: F26_YEAST Gi_number: 1169587 Description: Fructose-2,6-bisphosphatase
137 Accession: P55238 Swissprot_id: GLGS_HORVU Gi_number: 1707940 Description: Glucose-1-phosphate adenylyltransferase small subunit, chloroplast precursor (ADP-glucose synthase) (ADP-glucose pyrophosphorylase) (AGPASE B) (Alpha-D-glucose-1-phosphate adenyl transferase)
195 Accession: Q02817 Swissprot_id: MUC2_HUMAN Gi_number: 2506877 Description: MUCIN 2 PRECURSOR (INTESTINAL MUCIN 2)
263 Accession: O22643 Swissprot_id: ACBP_FRIAG Gi_number: 5902717 Description: ACYL-COA-BINDING PROTEIN (ACBP)
223 Accession: P07728 Swissprot_id: GU11_ORYSA Gi_number: 121469 Description: GLUTELIN TYPE I PRECURSOR (CLONE PREE 61)
85 Accession: P27937 Swissprot_id: AM3B_ORYSA Gi_number: 113680 Description: ALPHA-AMYLASE ISOZYME 3B PRECURSOR (1,4-ALPHA-D-GLUCAN GLUCANOHYDROLASE)
129 Accession: Q43093 Swissprot_id: UGS3_PEA Gi_number: 2833384 Description: Glycogen [starch] synthase, chloroplast precursor (GBSSII) (Granule-bound starch synthase II)
103 Accession: P93594 Swissprot_id: AMYB_WHEAT Gi_number: 3334120 Description: BETA-AMYLASE (1,4-ALPHA-D-GLUCAN MALTOHYDROLASE)
51 Accession: P46032 Swissprot_id: PT2B_ARATH Gi_number: 1172704 Description: Peptide transporter PTR2-B (Histidine transporting protein)
99 Accession: Q01885 Swissprot_id: RAG2_ORYSA Gi_number: 548671 Description: SEED ALLERGENIC PROTEIN RAG2 PRECURSOR
69 Accession: Q01401 Swissprot_id: GLGB_ORYSA Gi_number: 399544 Description: 1,4-ALPHA-GLUCAN BRANCHING ENZYME (STARCH BRANCHING ENZYME) (Q-ENZYME)
229 Accession: P07730 Swissprot_id: GLU2_ORYSA Gi_number: 121475 Description: GLUTELIN TYPE II PRECURSOR
241 Accession: P15839 Swissprot_id: PRO1_ORYSA Gi_number: 130946 Description: 10 KD PROLAMIN PRECURSOR
91 Accession: P17654 Swissprot_id: AMY1_ORYSA Gi_number: 113766 Description: ALPHA-AMYLASE PRECURSOR (1,4-ALPHA-D-GLUCAN GLUCANOHYDROLASE) (ISOZYME 1B)
401 Accession: P14323 Swissprot_id: GLU4_ORYSA Gi_number: 121476 Description: GLUTELIN PRECURSOR
121 Accession: P31924 Swissprot_id: SUS2_ORYSA Gi_number: 401140 Description: Sucrose synthase 2 (Sucrose-UDP glucosyltransferase 2)
403 Accession: O65806 Swissprot_id: MGN_EUPLA Gi_number: 6016561 Description: Mago nashi protein homolog
187 Accession: O64422 Swissprot_id: F16P_ORYSA Gi_number: 3913641 Description: FRUCTOSE-1,6-BISPHOSPHATASE, CHLOROPLAST PRECURSOR (D-FRUCTOSE-1,6-BISPHOSPHATE 1-PHOSPHOHYDROLASE) (FBPASE)
13 Accession: Q41001 Swissprot_id: BCP_PEA Gi_number: 2493318 Description: Blue copper protein precursor
243 Accession: P20698 Swissprot_id: PRO7_ORYSA Gi_number: 130959 Description: PROLAMIN PPROL 17 PRECURSOR
203 Accession: Q10767 Swissprot_id: GLGX_MYCTU Gi_number: 1707945 Description: Glycogen operon protein glgX homolog
407 Accession: Q00808 Swissprot_id: HET1_PODAN Gi_number: 3023956 Description: Vegetatible incompatibility protein HET-E-1
409 Accession: P47917 Swissprot_id: ZRP4_MAIZE Gi_number: 1353193 Description: O-METHYLTRANSFERASE ZRP4 (OMT)
411 Accession: P08640 Swissprot_id: AMYH_YEAST Gi_number: 728850 Description: GLUCOAMYLASE S1/S2 PRECURSOR (GLUCAN 1,4-ALPHA-GLUCOSIDASE) (1,4-ALPHA-D-GLUCAN GLUCOHYDROLASE)
105 Accession: P55005 Swissprot_id: AMYB_MAIZE Gi_number: 1703302 Description: BETA-AMYLASE (1,4-ALPHA-D-GLUCAN MALTOHYDROLASE)
107 Accession: P10538 Swissprot_id: AMYB_SOYBN Gi_number: 231541 Description: BETA-AMYLASE (1,4-ALPHA-D-GLUCAN MALTOHYDROLASE)
115 Accession: Q43763 Swissprot_id: AGLU_HORVU Gi_number: 3023275 Description: ALPHA-GLUCOSIDASE PRECURSOR (MALTASE)
15 Accession: P25685 Swissprot_id: DJB1_HUMAN Gi_number: 1706473 Description: DnaJ homolog subfamily B member 1 (Heat shock 40 kDa protein 1) (Heat shock protein 40) (HSP40) (DnaJ protein homolog 1) (HDJ-1)
165 Accession: P27598 Swissprot_id: PHSL_IPOBA Gi_number: 130172 Description: Alpha-1,4 glucan phosphorylase, L isozyme, chloroplast precursor (Starch phosphorylase L)
123 Accession: Q43009 Swissprot_id: SUS3_ORYSA Gi_number: 3915054 Description: Sucrose synthase 3 (Sucrose-UDP glucosyltransferase 3)
205 Accession: Q02817 Swissprot_id: MUC2_HUMAN Gi_number: 2506877 Description: MUCIN 2 PRECURSOR (INTESTINAL MUCIN 2)
413 Accession: P40603 Swissprot_id: APG_BRANA Gi_number: 728868 Description: ANTER-SPECIFIC PROLINE-RICH PROTEIN APG (PROTEIN CEX)
209 Accession: P13526 Swissprot_id: ARLC_MAIZE Gi_number: 114156 Description: ANTHOCYANIN REGULATORY LC PROTEIN
323 Accession: P70315 Swissprot_id: WASP_MOUSE Gi_number: 2499130 Description: Wiskott-Aldrich syndrome protein homolog (WASP)
415 Accession: P19837 Swissprot_id: SPD1_NEPCL Gi_number: 1174414 Description: SPIDROIN 1 (DRAGLINE SILK FIBROIN 1)
141 Accession: P55238 Swissprot_id: GLGS_HORVU Gi_number: 1707940 Description: Glucose-1-phosphate adenylyltransferase small subunit, chloroplast precursor (ADP-glucose synthase) (ADP-glucose pyrophosphorylase) (AGPASE B) (Alpha-D-glucose-1-phosphate adenyl transferase)
27 Accession: Q02723 Swissprot_id: RKI1_SECCE Gi_number: 400982 Description: Carbon catabolite derepressing protein kinase
65 Accession: P15710 Swissprot_id: PHO4_NEUCR Gi_number: 130117 Description: PHOSPHATE-REPRESSIBLE PHOSPHATE PERMEASE
185 Accession: Q41140 Swissprot_id: PFPA_RICCO Gi_number: 2499488 Description: PYROPHOSPHATE--FRUCTOSE 6-PHOSPHATE 1-PHOSPHOTRANSFERASE ALPHA SUBUNIT (PFP) (6-PHOSPHOFRUCTOKINASE (PYROPHOSPHATE)) (PYROPHOSPHATE-DEPENDENT 6-PHOSPHOFRUCTOSE-1-KINASE) (PPI-PFK)
299 Accession: P09651 Swissprot_id: ROA1_HUMAN Gi_number: 133254 Description: Heterogeneous nuclear ribonucleoprotein A1 (Helix-destabilizing protein) (Single-strand binding protein) (hnRNP core protein A1)
67 Accession: P46032 Swissprot_id: PT2B_ARATH Gi_number: 1172704 Description: Peptide transporter PTR2-B (Histidine transporting protein)
17 Accession: Q02028 Swissprot_id: HS7S_PEA Gi_number: 399942 Description: Stromal 70 kDa heat shock-related protein, chloroplast precursor
279 Accession: P38994 Swissprot_id: MSS4_YEAST Gi_number: 1709144 Description: Probable phosphatidylinositol-4-phosphate 5-kinase MSS4 (1-phosphatidylinositol-4-phosphate kinase) (PIP5K) (PtdIns(4)P-5-kinase) (Diphosphoinositide kinase)
71 Accession: Q08047 Swissprot_id: GLGB_MAIZE Gi_number: 1169911 Description: 1,4-alpha-glucan branching enzyme IIB, chloroplast precursor (Starch branching enzyme IIB) (Q-enzyme)
207 Accession: P49572 Swissprot_id: TRPC_ARATH Gi_number: 1351303 Description: Indole-3-glycerol phosphate synthase, chloroplast precursor (IGPS)
417 Accession: P28284 Swissprot_id: ICP0_HSV2H Gi_number: 124135 Description: Transacting transcriptional protein ICP0 (VMW118 protein)
127 Accession: O24301 Swissprot_id: SUS2_PEA Gi_number: 3915037 Description: Sucrose synthase 2 (Sucrose-UDP glucosyltransferase 2)
125 Accession: O24301 Swissprot_id: SUS2_PEA Gi_number: 3915037 Description: Sucrose synthase 2 (Sucrose-UDP glucosyltransferase 2)
183 Accession: Q59126 Swissprot_id: PFP_AMYME Gi_number: 3122594 Description: Pyrophosphate--fructose 6-phosphate 1-phosphotransferase (6-phosphofructokinase (Pyrophosphate)) (Pyrophosphate-dependent 6-phosphofructose-1-kinase) (PPI-PFK)
419 Accession: Q02897 Swissprot_id: GLUC_ORYSA Gi_number: 544400 Description: GLUTELIN TYPE-B 2 PRECURSOR
421 Accession: Q06003 Swissprot_id: GOLI_DROME Gi_number: 462193 Description: Goliath protein (G1 protein)
29 Accession: P53682 Swissprot_id: CDP1_ORYSA Gi_number: 1705733 Description: Calcium-dependent protein kinase, isoform 1 (CDPK 1)
297 Accession: P25822 Swissprot_id: PUM_DROME Gi_number: 131605 Description: MATERNAL PUMILIO PROTEIN
245 Accession: P45386 Swissprot_id: IGA4_HAEIN Gi_number: 1170517 Description: IMMUNOGLOBULIN A1 PROTEASE PRECURSOR (IGA1 PROTEASE)
427 Accession: Q05654 Swissprot_id: RDPO_SCHPO Gi_number: 1710054 Description: RETROTRANSPOSABLE ELEMENT TF2 155 KDA PROTEIN
159/171 Accession: P42862 Swissprot_id: G6PA_ORYSA Gi_number: 1169797 Description: Glucose-6-phosphate isomerase, cytosolic A (GPI-A) (Phosphoglucose isomerase A) (PGI-A) (Phosphohexose isomerase A) (PHI-A)
31 Accession: Swissprot_id: Gi_number: Description: Peptide transporter P46032
PT2B_ARATH 1172704 PTR2-B (Histidine transportingprotein) 403/431- Accession: Swissprot_id: Gi_number: Description:VITELLOGENIN II P02845 VIT2_CHICK 138595 PRECURSOR (MAJOR VITELLOGENIN)[CONTAINS: LIPOVITELLIN I (LVI); PHOSVITIN (PV); LIPOVITELLIN II (LVII);YGP40] 275 Accession: Swissprot_id: Gi_number: Description:TRANSCRIPTIONAL P15276 ALGP_PSEAE 13959675 REGULATORY PROTEIN ALGP(ALGINATE REGULATORY PROTEIN ALGR3) 19 Accession: Swissprot_id:Gi_number: Description: Protein phosphatase 2C O62830 P2CB_BOVIN10720178 beta isoform (PP2C-beta) 151 Accession: Swissprot_id:Gi_number: Description: Alpha-glucan phos- Q9LKJ3 PHSH_WHEAT 14916632phorylase, H isozyme (Starch phosphorylase H) 213/227- Accession:Swissprot_id: Gi_number: Description: 19 KD GLOBULIN P29835 GL19_ORYSA232161 PRECURSOR (ALPHA-GLOBULIN) 237 Accession: Swissprot_id:Gi_number: Description: CALMODULIN P02595 CALM_PATSP 115518 133Accession: Swissprot_id: Gi_number: Description: Granule-bound glycogenQ42968 UGST_ORYGL 2833382 [starch] synthase, chloroplast precursor 239Accession: Swissprot_id: Gi_number: Description: GLUTELIN TYPE-A Q09151GLU3_ORYSA 1707986 III PRECURSOR 161 Accession: Swissprot_id: Gi_number:Description: UTP--GLUCOSE-1- Q43772 UDPG_HORVU 6136111 PHOSPHATEURIDYLYLTRANSFERASE (UDP-GLUCOSE PYROPHOSPHORYLASE) (UDPGP) (UGPASE) 61Accession: Swissprot_id: Gi_number: Description: Intestinal P70545NDC2_RAT 2499525 sodium/dicarboxylate cotransporter Na(+)/dicarboxylatecotransporter) 47 Accession: Swissprot_id: Gi_number: Description:INORGANIC PHOSPHATE P25297 PH84_YEAST 1346710 TRANSPORTER PHO84 219Accession: Swissprot_id: Gi_number: Description: PROLAMIN PPROL 17P20698 PRO7_ORYSA 130959 PRECURSOR 435 Accession: Swissprot_id:Gi_number: Description: SEED ALLERGENIC Q01881 RA05_ORYSA 548657 PROTEINRA5 PRECURSOR 259/271- Accession: Swissprot_id: Gi_number: Description:OLEOSIN 16 KD Q42980 OLE1_ORYSA 3334280 (OSE701) 93 Accession:Swissprot_id: Gi_number: Description: PROTEIN KINASE APK1B P46573APKB_ARATH 
12644274 441 Accession: Swissprot_id: Gi_number: Description:Luminal binding protein Q03685 BIP5_TOBAC 729623 5 precursor (BiP 5) (78kDa glucose- regulated protein homolog 5) (GRP 78-5) 111 Accession:Swissprot_id: Gi_number: Description: ATP-binding cassette, Q99758ABC3_HUMAN 7387524 subfamily A, member 3 (ATP-binding cassettetransporter 3) (ATP-binding cassette 3) (ABC-C transporter) 73Accession: Swissprot_id: Gi_number: Description: 1,4-alpha-glucan Q08047GLGB_MAIZE 1169911 branching enzyme IIB, chloroplast precursor (Starchbranching enzyme IIB) (Q-enzyme) 443 Accession: Swissprot_id: Gi_number:Description: Luminal binding protein Q03685 BIP5_TOBAC 729623 5precursor (BiP 5) (78 kDa glucose- regulated protein homolog 5) (GRP78-5) 235 Accession: Swissprot_id: Gi_number: Description: P14614GLU5_ORYSA 121477 GLUTELIN PRECURSOR 217 Accession: Swissprot_id:Gi_number: Description: 13 KD PROLAMIN PRECURSOR P17048 PRO2_ORYSA6174927 257 Accession: Swissprot_id: Gi_number: Description: OLEOSIN 18KD (OSE721) Q40646 OLE2_ORYSA 3334279 201 Accession: Swissprot_id:Gi_number: Description: Receptor-like protein P47735 RLK5_ARATH 1350783kinase 5 precursor 445 Accession: Swissprot_id: Gi_number: Description:SULFATED SURFACE P21997 SSGP_VOLCA 134920 GLYCOPROTEIN 185 (SSG 185) 281Accession: Swissprot_id: Gi_number: Description: SACCHAROPINE P38999LYS9_YEAST 729968 DEHYDROGENASE [NADP+, L-GLUTAMATE FORMING] 251Accession: Swissprot_id: Gi_number: Description: Cyclic-nucleotide-gatedQ00195 CNG2_RAT 116574 olfactory channel (Cyclic-nucleotide- gatedcation channel 2) (CNG channel 2) (CNG2) (CNG-2) (OCNC1) 3 Accession:Swissprot_id: Gi_number: Description: Receptor-like protein P47735RLK5_ARATH 1350783 kinase 5 precursor 447 Accession: Swissprot_id:Gi_number: Description: Peroxisome assembly O60683 PEXA_HUMAN 3914299protein 10 (Peroxin-10) 21 Accession: Swissprot_id: Gi_number:Description: PROTEIN KINASE APK1B P46573 APKB_ARATH 12644274 179Accession: Swissprot_id: Gi_number: Description: 
Prophosphate-- P21343PFPB_SOLTU 2507174 fructose 6-phosphate 1-phosphotransferase betasubunit (PFP) (6-phosphofructokinase (Pyrophosphate)) (Pyrophosphate-dependent 6-phosphofructose-1- kinase) (PPI-PFK) 319 Accession:Swissprot_id: Gi_number: Description: GLYCERALDEHYDE Q64467 G3PT_MOUSE2494630 3-PHOSPHATE DEHYDROGENASE, ESTIS-SPECIFIC (GAPDH) 7 Accession:Swissprot_id: Gi_number: Description: Probable protease P20346P322_SOLTU 129350 inhibitor P322 precursor 291 Accession: Swissprot_id:Gi_number: Description: Neural Wiskott-Aldrich O08816 WASL_RAT 13431956syndrome protein (N-WASP) 169 Accession: Swissprot_id: Gi_number:Description: UTP--glucose-1-phosphate O64459 UDPG_PYRPY 6136112uridylyltransferase (UDP-glucose pyrophosphorylase) (UDPGP) (UGPase) 83Accession: Swissprot_id: Gi_number: Description: ALPHA-AMYLASE ISOZYMEP27933 AM3D_ORYSA 113682 3D PRECURSOR (1,4-ALPHA-D-GLUCANGLUCANOHYDROLASE) 269 Accession: Swissprot_id: Gi_number: Description:PHOSPHOLIPASE D2 O14939 PLD2_HUMAN 13124441 (PLD 2) CHOLINE PHOSPHATASE2) PHOSPHATIDYLCHOLINE-HYDROLYZING PHOSPHOLIPASE D2) (PLD1C) 95Accession: Swissprot_id: Gi_number: Description: SEED ALLERGEN1C Q01885RAG2_ORYSA 548671 PROTEIN RAG2 PRECURSOR 9 Accession: Swissprot_id:Gi_number: Description: Eukaryotic initiation Q03387 IF41_WHEAT 1170504factor (iso)4F subunit P82-34 (eIF-(iso)4F P82-34) 449 Accession:Swissprot_id: Gi_number: Description: Palmitoyl-protein P50897 PPT_HUMAN1709747 thioesterase precursor (Palmitoyl- protein hydrolase) 451Accession: Swissprot_id: Gi_number: Description: Cell wall protein DAN4P47179 DAN4_YEAST 1352944 precursor 277 Accession: Swissprot_id:Gi_number: Description: MUCIN 2 PRECURSOR Q02817 MUC2_HUMAN 2506877(INTESTINAL MUCIN 2) 285 Accession: Swissprot_id: Gi_number:Description: MATERNAL PUMILIO PROTEIN P25822 PUM_DROME 131605 453Accession: Swissprot_id: Gi_number: Description: REGULATORY PROTEIN E2P06921 VE2_HPV05 1352839 265 Accession: Swissprot_id: Gi_number:Description: ANTER-SPECIFIC 
PROLINE- P40602 APG_ARATH 728867 RICHPROTEIN APG PRECURSOR 327 Accession: Swissprot_id: Gi_number:Description: Myb protein Q08759 MYB_XENLA 730090 231 Accession:Swissprot_id: Gi_number: Description: CALMODULIN-RELATED P27164CAL3_PETHY 115492 PROTEIN 37 Accession: Swissprot_id: Gi_number:Description: Peptide transporter P46032 PT2B_ARATH 1172704 PTR2-B(Histidine transporting protein) 455 Accession: Swissprot_id: Gi_number:Description: VITELLOGENIN II P02845 VIT2_CHICK 138595 PRECURSOR (MAJORVITELLOGENIN) [CONTAINS: LIPOVITELLIN I (LVI); PHOSVITIN (PV);LIPOVITELLIN II (LVII); YGP40] 43 Accession: Swissprot_id: Gi_number:Description: MLO PROTEIN P93766 MLO_HORVU 6016588 457 Accession:Swissprot_id: Gi_number: Description: VACUOLAR PROTEIN Q07878 VP13_YEAST2499125 SORTING-ASSOCIATED PROTEIN VPS 13 459 Accession: Swissprot_id:Gi_number: Description: Protein-export Q50634 SECD_MYCTU 2498898membrane protein secD 293 Accession: Swissprot_id: Gi_number:Description: Minor extracellular P29141 SUBV_BACSU 135023 protease VPRprecursor 321 Accession: Swissprot_id: Gi_number: Description: Mybproto-oncogene P01103 MYB_CHICK 127591 protein (C-myb) 79 Accession:Swissprot_id: Gi_number: Description: GLUCOAMYLASE P08640 AMYH_YEAST728850 S1/S2 PRECURSOR (GLUCAN 1,4-ALPHA- GLUCOSIDASE)(1,4-ALPHA-D-GLUCAN GLUCOHYDROLASE) 211 Accession: Swissprot_id:Gi_number: Description: GAMMA-GLIADIN P08079 GDB0_WHEAT 121099 PRECURSOR177 Accession: Swissprot_id: Gi_number: Description:FRUCTOSE-BISPHOSPHATE P46256 ALF1_PEA 1168408 ALDOLASE, CYTOPLASMICISOZYME 1 461 Accession: Swissprot_id: Gi_number: Description: MUCIN 2PRECURSOR Q02817 MUC2_HUMAN 2506877 (INTESTINAL MUCIN 2)

All publications, patents and patent applications are incorporated herein by reference. While in the foregoing specification this invention has been described in relation to certain preferred embodiments thereof, and many details have been set forth for purposes of illustration, it will be apparent to those skilled in the art that the invention is susceptible to additional embodiments and that certain of the details described herein may be varied considerably without departing from the basic principles of the invention.

1. A polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the synthesis, metabolism or degradation of carbohydrates in the plant grain and the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a sequence encoding a polypeptide as given in SEQ ID NOs: 70-210 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
2. The polynucleotide of claim 1 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 69-209 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 69-209, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
3. A polynucleotide according to claim 1 comprising a nucleotide sequence encoding a polypeptide which is involved in or associated with starch biosynthesis and up-regulated during grain filling, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 70-188 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
4. The polynucleotide of claim 3 comprising a nucleotide sequence a) as given in any one of the SEQ ID NOs of Table 7 such as SEQ ID NOs: 69-187 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 69-187, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
5. The polynucleotide of claim 3 comprising a nucleotide sequence encoding a polypeptide with an activity of a small and large subunit ADPG pyrophosphorylase, respectively, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 136-142 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
6. The polynucleotide of claim 5 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 135-141 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of nucleotides given in SEQ ID NOs: 135-141, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
7. A polynucleotide according to claim 3 comprising a nucleotide sequence encoding a polypeptide involved in starch structure rearrangement, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 76-78 exhibiting isoamylase debranching enzyme activity; 70-74 exhibiting a branching enzyme activity, 80-92 exhibiting an α-amylase activity; 94-100 exhibiting an α-amylase inhibitor activity; 110 exhibiting a pullulanase activity; 102-108 exhibiting a β-amylase activity; 112-118 exhibiting an α-glucosidase activity, or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
8. The polynucleotide of claim 7, comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 75-77 exhibiting isoamylase debranching enzyme activity; 69-73 exhibiting a branching enzyme activity, 79-91 exhibiting an α-amylase activity; 93-99 exhibiting an α-amylase inhibitor activity; 109 exhibiting a pullulanase activity; 101-107, exhibiting a β-amylase activity; 111-117 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 75-77 exhibiting isoamylase debranching enzyme activity; 69-73 exhibiting a branching enzyme activity, 79-91 exhibiting an α-amylase activity; 93-99 exhibiting an α-amylase inhibitor activity; 109 exhibiting a pullulanase activity; 101-107, exhibiting a β-amylase activity; 111-117; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
9. A polynucleotide according to claim 3 comprising a nucleotide sequence encoding a polypeptide exhibiting an amylase or an amylase inhibitor activity, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 80-92 exhibiting an α-amylase activity; and 94-100 exhibiting an α-amylase inhibitor activity, or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
10. The polynucleotide of claim 9 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 79-91 exhibiting an α-amylase activity; and 93-99 exhibiting an α-amylase inhibitor activity or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 79-91 exhibiting an α-amylase activity; and 93-99 exhibiting an α-amylase inhibitor activity, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
11. A polynucleotide according to claim 3 comprising a nucleotide sequence encoding a polypeptide exhibiting a sucrose synthase activity, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 120-128 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
12. The polynucleotide of claim 11 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 119-127 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NOs: 119-127 or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
13. A polynucleotide according to claim 3 comprising a nucleotide sequence encoding a polypeptide exhibiting a glucanase activity, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NO: 192 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
14. The polynucleotide of claim 13 comprising a nucleotide sequence a) as given in SEQ ID NO: 191 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of nucleotides given in SEQ ID NO: 191 or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
15. A polynucleotide comprising a nucleotide sequence encoding a seed storage protein, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 212-250 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
16. The polynucleotide of claim 15 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 211-249 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 211-249 or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
17. The polynucleotide of claim 15 comprising a nucleotide sequence encoding a glutelin protein the expression of which is up-regulated during grain filling, which nucleic acid molecule is substantially similar to a nucleic acid encoding a polypeptide as given in SEQ ID NOs: 224, 236, and 240 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
18. The polynucleotide of claim 17 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 223, 235, and 239 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 223, 235, and 239, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
19. A polynucleotide according to claim 15 comprising a nucleotide sequence encoding a prolamin protein the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 218, 220, 226 and 242 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
20. The polynucleotide of claim 19 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 217, 219, 225 and 241 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 217, 219, 225 and 241, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
21. A polynucleotide according to claim 15 comprising a nucleotide sequence encoding a gliadin protein, the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 212, 219; 234, 248; and 250 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
22. The polynucleotide of claim 21 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 211, 220; 233, 247; and 249 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 135325; 135133; 10825, 135101; and 135103, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
23. A polynucleotide the expression of which is up-regulated during grain filling comprising a nucleotide sequence encoding a polypeptide that is involved in or associated with fatty acid synthesis or lipid metabolism, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 252-280 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
24. The polynucleotide of claim 23 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 251-279 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of nucleotides given in any one of SEQ ID NOs: 251-279 or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
25. A polynucleotide according to claim 23 comprising a nucleotide sequence encoding an oleosin protein, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 258 and 260 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
26. The polynucleotide of claim 25 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 257 and 259 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 257 and 259, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
27. A polynucleotide according to claim 23 comprising a nucleotide sequence encoding a polypeptide the activity of which is involved in or associated with the dehydrogenation of phytoene and the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NO: 278 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
28. The polynucleotide of claim 27 comprising a nucleotide sequence a) as given in SEQ ID NO: 277 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in SEQ ID NO: 277, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
29. A polynucleotide comprising a nucleotide sequence that encodes a polypeptide that acts as a transcription factor and the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 302-328 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
30. The polynucleotide of claim 29 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 301-327 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 301-327, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
 31. A polynucleotide comprising a nucleotide sequence encoding a polypeptide the activity of which is involved or associated with the metabolism of amino acids and the expression of which is up-regulated during grain filling, which nucleotide sequence is substantially similar to a nucleic acid sequence encoding a polypeptide as given in SEQ ID NOs: 282-300 or a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide.
 32. The polynucleotide of claim 31 comprising a nucleotide sequence a) as given in any one of SEQ ID NOs: 281-299 or a part thereof which still encodes a partial-length polypeptide having substantially the same activity as the full-length polypeptide, e.g., at least 50%, more preferably at least 80%, even more preferably at least 90% to 95% the activity of the full-length polypeptide; b) having substantial similarity to (a); c) capable of hybridizing to (a) or the complement thereof; d) capable of hybridizing to a nucleic acid comprising 50 to 200 or more consecutive nucleotides of a nucleotide sequence given in any one of SEQ ID NOs: 281-299, or the complement thereof; e) complementary to (a), (b) or (c); and f) which is the reverse complement of (a), (b) or (c).
 33. A polypeptide which has an amino acid sequence encoded by any one of the polynucleotides according to claim 1.
 34. A polypeptide according to claim 33, which has an amino acid sequence encoded by a polynucleotide selected from the group consisting of SEQ ID NOs: 1 to 461, 501-511, and 513-641.
 35. A polypeptide according to claim 33 wherein said polypeptide has at least 90% amino acid sequence identity to a polypeptide selected from the group consisting of SEQ ID NOs: 2-462, 502-512, and 514-642.
 36. An isolated nucleic acid molecule comprising a nucleotide sequence, which nucleotide sequence is obtained or obtainable from plant genomic DNA comprising a gene having an open reading frame (ORF) encoding a polypeptide which has at least between 70% and 99% amino acid sequence identity to a polypeptide encoded by an Oryza, e.g., Oryza sativa, gene comprising a nucleotide sequence as given in SEQ ID NOs: 1 to 461, 501-511, and 513-641.
 37. A recombinant vector comprising a polynucleotide of claim 1.
 38. An expression cassette comprising as operably linked components, a promoter, a polynucleotide of claim 1 and a termination sequence.
 39. A host cell comprising the expression cassette of claim 38.
 40. The host cell of claim 39 wherein said host cell is a bacterial cell, a yeast cell, an animal cell or a plant cell.
 41. The host cell of claim 40, wherein said plant cell is from a cereal plant.
 42. A plant comprising a host cell of claim 39.
 43. A plant according to claim 42, wherein said plant is selected from the group consisting of maize, soybean, barley, alfalfa, sunflower, tomato, banana, canola, cotton, peanut, sorghum, tobacco, sugarbeet, wheat, and rice.
 44. A method of modulating carbohydrate composition of the plant grain, comprising functionally integrating an isolated nucleic acid molecule according to claim 1 comprising a nucleic acid sequence encoding a polypeptide, which is involved in or associated with the synthesis, metabolism or degradation of carbohydrates in the plant grain and the expression of which is up-regulated during grain filling, into a cell, group of cells, tissue or organ of a plant.
 45. A method of modulating the protein content and composition of the plant grain, comprising functionally integrating an isolated nucleic acid molecule according to claim 15 comprising a nucleic acid sequence encoding a polypeptide, which is involved in or associated with the synthesis, metabolism or degradation of seed storage proteins in the plant grain and the expression of which is up-regulated during grain filling, into a cell, group of cells, tissue or organ of a plant.
 46. A method of modulating the fatty acid and/or lipid content and composition of the plant grain, comprising functionally integrating an isolated nucleic acid molecule according to claim 23 comprising a nucleic acid sequence encoding a polypeptide, which is involved in or associated with fatty acid synthesis or lipid metabolism in the plant grain and the expression of which is up-regulated during grain filling, into a cell, group of cells, tissue or organ of a plant.
 47. A method of modulating the grain filling process of the plant grain, comprising functionally integrating an isolated nucleic acid molecule according to claim 28 comprising a nucleic acid sequence encoding a transcription factor polypeptide, which is involved in or associated with the regulation and coordination of grain filling in plants and the expression of which is up-regulated during grain filling, into a cell, group of cells, tissue or organ of a plant.
 48. A method of modulating the amino acid content and composition of the plant grain, comprising functionally integrating an isolated nucleic acid molecule according to claim 31 comprising a nucleic acid sequence encoding a polypeptide the activity of which is involved or associated with the metabolism of amino acids and the expression of which is up-regulated during grain filling, into a cell, group of cells, tissue or organ of a plant.
 49. A method of modulating nutrient content and composition of the plant grain, comprising: a) functionally integrating i. an isolated nucleic acid molecule according to claim 1, or a portion thereof in an anti-sense orientation; or ii. a dsRNAi construct comprising an isolated nucleic acid molecule according to claim 1, or a portion thereof in both a sense and an anti-sense orientation, which, optionally, may be separated by a spacer region; under the transcriptional control of regulatory sequences required for expression in plants, into a cell, group of cells, tissue or organ of a plant; and b) expressing the constructs as provided in a) above in a cell, group of cells, a tissue or organ of a plant to produce an RNA transcript.
 50. A method of identifying or isolating polynucleotide sequences that are orthologous to a nucleic acid molecule according to claim 1 comprising a nucleic acid fragment encoding a polypeptide that is up-regulated during grain filling, from the genome of another plant, wherein all or a portion of a particular nucleic acid sequence according to claim 1 is used as a probe that selectively hybridizes to gene sequences present in a population of cloned genomic DNA fragments or cDNA fragments from a chosen source organism.
 51. A method to identify a nucleic acid molecule encoding a polypeptide the expression of which is up-regulated during grain filling, comprising: a) contacting a plurality of isolated nucleic acid samples comprising all or a portion of a particular nucleic acid sequence according to claim 1 on a solid substrate with a probe comprising plant nucleic acid corresponding to RNA isolated from a specific plant tissue during grain filling so as to form a complex, wherein each sample comprises a plurality of oligonucleotides corresponding to at least a portion of one plant gene; b) contacting a second plurality of isolated nucleic acid samples comprising all or a portion of a particular nucleic acid sequence according to claim 1 on a solid substrate with a second probe comprising plant nucleic acid corresponding to RNA that is taken at a different developmental stage of the plant; and c) comparing complex formation in a) with complex formation in b) so as to identify which samples correspond to genes that are expressed during grain filling.
 52. A method for detecting the presence of a polynucleotide according to claim 1, or a fragment or a variant thereof, or a complementary sequence thereto in a sample, the method including the following steps: a) bringing into contact a nucleotide probe or a plurality of nucleotide probes which can hybridize with a polynucleotide according to claim 1, or a fragment or a variant thereof, or a complementary sequence thereto and the sample to be assayed; and b) detecting the hybrid complex formed between the probe and a nucleotide in the sample.
 53. A kit for detecting the presence of a polynucleotide according to claim 1, or a fragment or a variant thereof, or a complementary sequence thereto in a sample, the kit including a nucleotide probe or a plurality of nucleotide probes which can hybridize with a nucleotide sequence comprised within a polynucleotide according to claim 1, or a fragment or a variant thereof, or a complementary sequence thereto and, optionally, the reagents necessary for performing the hybridization reaction.
 54. A method of modifying the frequency of a grain filling gene in a plant population, comprising the steps of: a) screening a plurality of plants using an oligonucleotide as a marker to determine the presence or absence of a grain filling gene in an individual plant, the oligonucleotide consisting of not more than 300 bases of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 461; b) selecting at least one individual plant for breeding based on the presence or absence of the grain filling gene; and c) breeding at least one plant thus selected to produce a population of plants having a modified frequency of the grain filling gene.
 55. A method according to claim 54, wherein the oligonucleotide comprises a simple sequence repeat (SSR) sequence comprising at least two consecutive repeat units of an SSR, the start and end points of which are provided in Tables 2 and 3, and a flanking sequence of at least about 14 nucleic acids immediately adjacent to said at least two consecutive repeat units.
 56. A method of plant breeding to select for or against a trait of interest which is associated with grain filling in plants, comprising the steps of: a. identifying the trait of interest; identifying at least one oligonucleotide that can be used as a marker for the trait, the oligonucleotide consisting of not more than 300 bases of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 461; b. screening at least one plant for the presence of the at least one oligonucleotide; c. selecting at least one plant based on presence or absence of the at least one oligonucleotide; d. breeding at least one plant thus selected to produce a population of plants having a modified frequency of the at least one oligonucleotide; and e. screening at least one plant of the population for the presence or absence of the grain filling trait.
 57. A method according to claim 56, wherein the oligonucleotide comprises a simple sequence repeat (SSR) sequence comprising at least two consecutive repeat units of an SSR, the start and end points of which are provided in Tables 2 and 3, and a flanking sequence of at least about 14 nucleic acids immediately adjacent to said at least two consecutive repeat units.
 58. A method of determining a varietal identity of a plant, comprising: a) obtaining a nucleic acid sample from a plant; b) identifying at least one oligonucleotide to obtain an oligonucleotide profile for the plant, wherein the oligonucleotide consists of not more than 300 bases of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 461, the oligonucleotide comprising a simple sequence repeat (SSR) sequence comprising at least two consecutive repeat units of an SSR, the start and end points of which are provided in Tables 2 and 3, and a flanking sequence of at least about 14 nucleic acids immediately adjacent to said at least two consecutive repeat units in the sample; and c) comparing the SSR profile to at least one known SSR profile corresponding to at least one known variety to determine the varietal identity of the plant.
 59. An oligonucleotide primer consisting of between 8 and 150 bases which comprises at least 14 bases selected from the group of flanking sequences obtainable from a nucleotide sequence provided in SEQ ID NO: 3435 to SEQ ID NO: 150133, which at least 14 bases are immediately adjacent to at least two consecutive repeat units of an SSR, the start and end points of which are provided in Tables 2 and 3.
 60. A computer-readable medium having stored thereon a data structure comprising: a) sequence information of a polynucleotide according to claim 1; 15-22; 23-28; 28-30 and 31 to 32 and/or a polynucleotide according to any one of claims . . . to . . . ; and b) a module receiving the nucleic acid molecule which compares the nucleic acid sequence of the molecule to at least one other nucleic acid sequence.